ARTICLE
TITLE

An Advanced Big Data Quality Framework Based on Weighted Metrics

Widad Elouataoui    
Imane El Alaoui    
Saida El Mendili and Youssef Gahi    

Abstract

Big data offers numerous benefits, but using it means addressing new challenges in data processing, data security, and especially the degradation of data quality. Despite the growing importance of data quality for big data, its measurement is currently limited to a few metrics: although more than 50 data quality dimensions have been defined in the literature, only 11 of them are measured. This paper therefore extends the measured dimensions by defining four new data quality metrics: Integrity, Accessibility, Ease of manipulation, and Security. We propose a comprehensive Big Data Quality Assessment Framework based on 12 metrics: Completeness, Timeliness, Volatility, Uniqueness, Conformity, Consistency, Ease of manipulation, Relevancy, Readability, Security, Accessibility, and Integrity. To make the assessment more accurate, we apply weights at three data unit levels: data fields, quality metrics, and quality aspects. We also define and measure five quality aspects to provide a macro-view of data quality. Finally, we run an experiment that implements the defined measures. The results show that the proposed methodology allows a more exhaustive and accurate big data quality assessment, defining a weighted quality score based on 12 metrics and achieving a best quality model score of 9/10.
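The three-level weighting mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each level is aggregated with a plain weighted average; the abstract does not give the paper's actual formulas, so this is an illustration only. The field names, the aspect names (`aspect_a`, `aspect_b`), and all scores and weights are hypothetical.

```python
from typing import Dict


def weighted_mean(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of scores in [0, 1]; weights need not sum to 1."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


# Level 1 (data fields): per-field completeness, weighted by field importance.
# Hypothetical fields and values.
field_completeness = {"email": 0.9, "phone": 0.5}
field_weights = {"email": 3.0, "phone": 1.0}
completeness = weighted_mean(field_completeness, field_weights)

# Level 2 (quality metrics): metric scores combined into one aspect score.
metric_scores = {"completeness": completeness, "uniqueness": 1.0}
metric_weights = {"completeness": 1.0, "uniqueness": 1.0}
aspect_a = weighted_mean(metric_scores, metric_weights)

# Level 3 (quality aspects): aspect scores combined into an overall score.
# "aspect_a" and "aspect_b" are placeholder names, not the paper's aspects.
aspect_scores = {"aspect_a": aspect_a, "aspect_b": 0.6}
aspect_weights = {"aspect_a": 2.0, "aspect_b": 1.0}
overall = weighted_mean(aspect_scores, aspect_weights)

print(f"completeness={completeness:.2f} aspect_a={aspect_a:.2f} overall={overall:.2f}")
```

Normalizing by the sum of the weights keeps every intermediate score in the same [0, 1] range, so the same helper can be reused at all three levels.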
