Journal
Applied Sciences


Article

Title

Generation of Controlled Synthetic Samples and Impact of Hyper-Tuning Parameters to Effectively Classify the Complex Structure of Overlapping Region

Zafar Mahmood

Naveed Anwer Butt

Ghani Ur Rehman

Muhammad Zubair

Muhammad Aslam

Afzal Badshah and Syeda Fizzah Jilani

Abstract

The classification of imbalanced and overlapping data has received sustained attention over the last decade, as most real-world applications comprise multiple classes with an imbalanced distribution of samples. Samples from different classes overlap near class boundaries, creating a complex structure for the underlying classifier. Due to the imbalanced distribution of samples, the underlying classifier favors samples from the majority class and ignores samples from the smallest minority class. The imbalanced nature of the data, resulting in overlapping regions, greatly affects the learning of various machine learning classifiers, as most of them are designed to handle balanced datasets and perform poorly when applied to imbalanced data. To improve learning on multi-class problems, more expertise is required in both traditional classifiers and problem domain datasets, along with some experimentation and knowledge of hyper-tuning the parameters of the classifier under consideration. Several techniques for learning from multi-class problems have been reported in the literature, such as sampling techniques, algorithm adaptation methods, transformation methods, hybrid methods, and ensemble techniques. In the current research work, we first analyze the learning behavior of state-of-the-art ensemble and non-ensemble classifiers on imbalanced and overlapping multi-class data. After this analysis, we use grid search to optimize key parameters (by hyper-tuning) of ensemble and non-ensemble classifiers and determine the optimal set of parameters to enhance learning on multi-class imbalanced classification problems, using 15 public datasets. After hyper-tuning, synthetic samples amounting to 20% of each dataset's samples are generated and added to the majority class of that dataset to make it more overlapped (a more complex structure). After the synthetic samples are added, the hyper-tuned ensemble and non-ensemble classifiers are tested on that complex structure. This paper also includes a brief description of the tuned parameters and their effects on imbalanced data, followed by a detailed comparison of ensemble and non-ensemble classifiers with default and tuned parameters on both the original and the synthetically overlapped datasets. We believe that this paper is the first effort of its kind in this domain, and that it will open up various research directions with a greater focus on classifier parameters in the field of learning from imbalanced data using machine learning algorithms.
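The workflow summarized in the abstract (tune parameters by grid search, add synthetic majority-class samples equal to 20% of the dataset, then re-evaluate) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say how the synthetic samples are generated, so the interpolation scheme here is hypothetical, as are the toy three-class data, the k-nearest-neighbour classifier standing in for the ensemble and non-ensemble classifiers, the macro-F1 score, and every function name.

```python
import random
from collections import Counter
from itertools import product
from math import dist


def add_overlapping_majority_samples(X, y, fraction=0.2, seed=0):
    """Append synthetic majority-class points placed toward other classes.

    Assumption: each new point interpolates between a random majority sample
    and a random non-majority sample. The abstract does not describe the
    paper's actual generator.
    """
    rng = random.Random(seed)
    majority = Counter(y).most_common(1)[0][0]
    maj = [x for x, label in zip(X, y) if label == majority]
    rest = [x for x, label in zip(X, y) if label != majority]
    n_new = round(fraction * len(X))
    new_X = []
    for _ in range(n_new):
        a, b = rng.choice(maj), rng.choice(rest)
        t = rng.uniform(0.3, 0.7)  # hypothetical range: lands between the classes
        new_X.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return list(X) + new_X, list(y) + [majority] * n_new


def knn_predict(train_X, train_y, x, k, weighted):
    """Predict one label by (optionally distance-weighted) k-nearest-neighbour vote."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    votes = Counter()
    for point, label in nearest:
        votes[label] += 1.0 / (dist(point, x) + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)


def grid_search(X, y, grid, folds=3):
    """Score every parameter combination with interleaved k-fold cross-validation."""
    best_params, best_score = None, -1.0
    for k, weighted in product(grid["k"], grid["weighted"]):
        fold_scores = []
        for f in range(folds):
            test_idx = list(range(f, len(X), folds))
            held_out = set(test_idx)
            train = [(X[i], y[i]) for i in range(len(X)) if i not in held_out]
            tr_X, tr_y = zip(*train)
            preds = [knn_predict(tr_X, tr_y, X[i], k, weighted) for i in test_idx]
            fold_scores.append(macro_f1([y[i] for i in test_idx], preds))
        score = sum(fold_scores) / folds
        if score > best_score:
            best_params, best_score = {"k": k, "weighted": weighted}, score
    return best_params, best_score


def make_toy_data(seed=1):
    """Three Gaussian blobs with an imbalanced 60/25/15 split (illustrative only)."""
    rng = random.Random(seed)
    spec = [((0.0, 0.0), 60, "A"), ((3.0, 0.0), 25, "B"), ((0.0, 3.0), 15, "C")]
    X, y = [], []
    for (cx, cy), n, label in spec:
        X += [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(n)]
        y += [label] * n
    order = list(range(len(X)))
    rng.shuffle(order)  # shuffle so the interleaved folds mix the classes
    return [X[i] for i in order], [y[i] for i in order]


X, y = make_toy_data()
X_over, y_over = add_overlapping_majority_samples(X, y, fraction=0.2)
grid = {"k": [1, 3, 5, 7], "weighted": [False, True]}
best_original = grid_search(X, y, grid)
best_overlapped = grid_search(X_over, y_over, grid)
print("original:  ", best_original)
print("overlapped:", best_overlapped)
```

Macro-averaged F1 is used here because plain accuracy tends to reward a classifier that ignores minority classes; this is a design choice of the sketch, and the abstract does not state which metrics the paper reports.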

Keywords

machine learning algorithm - majority class - minority class - imbalance problem - parameter hyper-tuning - synthetic sample

Issue

Volume: 12, Issue: 16 (2022)

Subjects

Civil Engineering and Construction
Technology

DOI

https://doi.org/10.3390/app12168371
