JOURNAL
Applied Sciences

ARTICLE

TITLE

A Comparative Analysis of Machine Learning Methods for Class Imbalance in a Smoking Cessation Intervention

Khishigsuren Davagdorj

Jong Seol Lee

Van Huy Pham and Keun Ho Ryu

Abstract

Smoking is a major public health issue and a significant contributor to premature death. In recent years, numerous machine learning-based decision support systems have been developed to support smoking cessation. However, the class imbalance inherent in such data is considered a major challenge in deploying these systems. In this paper, we present an empirical comparison of machine learning techniques for handling the class imbalance problem in predicting the outcome of a smoking cessation intervention among the Korean population. The objective is to improve prediction performance by using two synthetic oversampling techniques: the synthetic minority over-sampling technique (SMOTE) and adaptive synthetic sampling (ADASYN). The experimental design comprises three components. First, the most representative features are selected in two phases: the lasso method and multicollinearity analysis. Second, newly balanced data are generated using the SMOTE and ADASYN techniques. Third, machine learning classifiers are applied to construct prediction models for all subjects and for each gender. To assess the effectiveness of the prediction models, the F-score, type I error, type II error, balanced accuracy, and geometric mean are used. The analysis demonstrates that Gradient Boosting Trees (GBT), Random Forest (RF), and multilayer perceptron neural network (MLP) classifiers achieved the best performance for all subjects and for each gender when SMOTE and ADASYN were used. The SMOTE with GBT and RF models also provide feature importance scores that enhance the interpretability of the decision support system. In addition, the results show that the presented synthetic oversampling techniques combined with machine learning models outperformed the baseline models in smoking cessation prediction.
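
As a rough illustration of two ideas named in the abstract, the sketch below shows (a) a simplified SMOTE-style interpolation between minority-class samples and (b) the listed evaluation indices computed from confusion-matrix counts. It is not the authors' implementation: the function names `smote_like` and `imbalance_metrics`, the toy data, and the counts are hypothetical, the metric formulas are the usual textbook definitions (assumed, not taken from the paper), and the oversampler assumes at least two minority samples.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified sketch of the SMOTE idea: pick a minority sample, pick one of
    its k nearest minority neighbours, and sample a point on the segment
    between the two. Assumes len(minority) >= 2.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def imbalance_metrics(tp, fp, tn, fn):
    """Indices named in the abstract, using their usual definitions (assumed)."""
    recall = tp / (tp + fn)        # true positive rate (sensitivity)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    return {
        "f_score": 2 * precision * recall / (precision + recall),
        "type_i_error": fp / (fp + tn),    # false positive rate
        "type_ii_error": fn / (fn + tp),   # false negative rate
        "balanced_accuracy": (recall + specificity) / 2,
        "g_mean": math.sqrt(recall * specificity),
    }


# Hypothetical toy data: three minority-class points in two dimensions.
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
new_points = smote_like(minority, n_new=4)

# Hypothetical confusion-matrix counts, not results from the paper.
scores = imbalance_metrics(tp=40, fp=10, tn=90, fn=10)
```

In practice, a maintained implementation such as the `SMOTE` and `ADASYN` classes in the imbalanced-learn package would normally be preferred over a hand-written oversampler; the sketch is only meant to make the interpolation step and the metric definitions concrete.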

Keywords

smoking - class imbalance - synthetic oversampling - machine learning - decision making - feature importance

ISSUE

Volume: 10 Issue: 9 (2020)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/app10093307
