JOURNAL
Big Data and Cognitive Computing

ARTICLE

TITLE

A Model for Enhancing Unstructured Big Data Warehouse Execution Time

Marwa Salah Farhan, Amira Youssef and Laila Abdelhamid

Abstract

Traditional data warehouses (DWs) have played a key role in business intelligence and decision support systems. However, the rapid growth of data generated by current applications requires new data warehousing systems. In the big data setting, existing warehouse systems must be adapted to overcome new issues and limitations. The main drawbacks of traditional Extract-Transform-Load (ETL) are that it cannot process very large amounts of data and that its execution time is very high when the data are unstructured. This paper proposes a new four-layer model, Extract-Clean-Load-Transform (ECLT), designed for processing unstructured big data, with specific emphasis on text. The model aims to reduce execution time, and this is assessed experimentally. ECLT is implemented and tested using Spark, a framework used here from Python. Finally, the paper compares the execution time of ECLT with that of other models on two datasets. Experimental results showed that for a data size of 1 TB, the execution time of ECLT is 41.8 s. When the data size increases to 1 million articles, the execution time is 119.6 s. These findings demonstrate that ECLT outperforms ETL, ELT, DELT, ELTL, and ELTA in terms of execution time.
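The four-layer ordering named in the abstract can be illustrated with a minimal sketch. The abstract does not say what each ECLT layer does internally, so every function body below is an assumption made for illustration, and plain Python stands in for Spark so the example stays self-contained. The names `extract`, `clean`, `load`, `transform`, and `run_eclt` are hypothetical and are not taken from the paper.

```python
import re
import time


def extract(sources):
    """Extract: gather raw text records from the sources (in-memory lists here)."""
    return [doc for source in sources for doc in source]


def clean(records):
    """Clean: normalize whitespace and drop empty records before loading (assumed rule)."""
    cleaned = []
    for text in records:
        text = re.sub(r"\s+", " ", text).strip()
        if text:
            cleaned.append(text)
    return cleaned


def load(records, warehouse):
    """Load: append the cleaned records to the target store (a plain list here)."""
    warehouse.extend(records)
    return warehouse


def transform(warehouse):
    """Transform: derive a structured view after loading (token counts, assumed)."""
    return [{"text": t, "n_tokens": len(t.split())} for t in warehouse]


def run_eclt(sources):
    """Run the four stages in ECLT order and measure wall-clock time in seconds."""
    start = time.perf_counter()
    warehouse = load(clean(extract(sources)), [])
    table = transform(warehouse)
    elapsed = time.perf_counter() - start
    return table, elapsed


# Toy input: two sources, one of which contains an empty record.
sources = [["  Big   data\nwarehouse ", ""], ["ETL versus ELT"]]
table, elapsed = run_eclt(sources)
```

The point of the sketch is only the stage order: cleaning happens before loading, and transformation runs on data already in the store. It says nothing about how the paper obtains its reported timings.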

Keywords

big data - unstructured data warehouse - ELT - ETL

ISSUE

Volume: 8 Issue: 2 (2024)

SUBJECTS

INFRASTRUCTURE

DOI

https://doi.org/10.3390/bdcc8020017
