JOURNAL
Big Data and Cognitive Computing

ARTICLE

TITLE

Effect of Missing Data Types and Imputation Methods on Supervised Classifiers: An Evaluation Study

Menna Ibrahim Gabr, Yehia Mostafa Helmy and Doaa Saad Elzanfaly

Abstract

Data completeness is one of the most common challenges that hinder the performance of data analytics platforms. Different studies have assessed the effect of missing values on different classification models based on a single evaluation metric, namely, accuracy. However, accuracy on its own is a misleading measure of classifier performance because it does not consider unbalanced datasets. This paper presents an experimental study that assesses the effect of incomplete datasets on the performance of five classification models. The analysis was conducted with different ratios of missing values in six datasets that vary in size, type, and balance. Moreover, for unbiased analysis, the performance of the classifiers was measured using three different metrics, namely, the Matthews correlation coefficient (MCC), the F1-score, and accuracy. The results show that the sensitivity of the supervised classifiers to missing data differs according to a set of factors. The most significant factor is the missing data pattern and ratio, followed by the imputation method, and then the type, size, and balance of the dataset. The sensitivity of the classifiers when data are missing due to the Missing Completely At Random (MCAR) pattern is less than their sensitivity when data are missing due to the Missing Not At Random (MNAR) pattern. Furthermore, using the MCC as an evaluation measure better reflects the variation in the sensitivity of the classifiers to the missing data.
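
The abstract's point that accuracy alone can mislead on unbalanced data can be illustrated with a small sketch. The code below is not taken from the paper: the label vectors are invented toy data (5 positives, 95 negatives), the function names are hypothetical, and returning 0 when a metric's denominator is zero is an assumed convention. It compares a classifier that always predicts the majority class with one that actually detects most positives.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy, invented data: 5 positive and 95 negative samples.
y_true = [1] * 5 + [0] * 95

# Classifier A always predicts the majority (negative) class.
pred_majority = [0] * 100

# Classifier B finds 4 of the 5 positives but raises 10 false alarms.
pred_detector = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85

majority_counts = confusion_counts(y_true, pred_majority)
detector_counts = confusion_counts(y_true, pred_detector)

for name, counts in [("majority", majority_counts), ("detector", detector_counts)]:
    print(
        f"{name}: accuracy={accuracy(*counts):.3f} "
        f"F1={f1_score(*counts):.3f} MCC={mcc(*counts):.3f}"
    )
```

On this toy data the majority-class predictor scores the higher accuracy (0.95 against 0.89) even though it never identifies a positive sample, while its F1-score and MCC are both 0. The second classifier has the lower accuracy but a clearly positive MCC, which is the kind of difference a single accuracy figure hides.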

Keywords

data quality - data completeness - missing patterns - imputation techniques - supervised classifiers - performance measures

ISSUE

Volume: 7 Part: 1 (2023)

SUBJECTS

INFRASTRUCTURE

DOI

https://doi.org/10.3390/bdcc7010055
