ARTICLE
TITLE

Revisiting Gradient Boosting-Based Approaches for Learning Imbalanced Data: A Case of Anomaly Detection on Power Grids

Maya Hilda Lestari Louk and Bayu Adhi Tama    

Abstract

Gradient boosting ensembles have been used in cyber-security for many years; nonetheless, their efficacy and accuracy for intrusion detection systems (IDSs) remain questionable, particularly on problems involving imbalanced data. This article addresses that gap by evaluating gradient boosting-based ensembles, namely gradient boosting machine (GBM), extreme gradient boosting (XGBoost), LightGBM, and CatBoost. Performance is assessed on several imbalanced data sets using the Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC), and F1 metrics. The article discusses an example of anomaly detection in an industrial control network and, more specifically, threat detection in a cyber-physical smart power grid. The test results indicate that CatBoost surpassed its competitors regardless of the imbalance ratio of the data sets. Moreover, LightGBM showed much lower performance and more variability across the data sets.
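To make the three metrics named in the abstract concrete, the following is a minimal sketch in plain Python of how MCC, F1, and AUC can be computed for a binary anomaly detector. It is an illustration only, not the authors' evaluation code: the function names (`confusion_counts`, `f1_score`, `mcc`, `roc_auc`) and the toy labels are assumptions introduced here, and label 1 is assumed to mark the minority (attack) class.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when any marginal is empty."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg) and is
    meant for illustration, not for large data sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 95 normal events and 5 attacks (a 19:1 imbalance).
    y_true = [0] * 95 + [1] * 5
    always_normal = [0] * 100  # a detector that never raises an alarm
    accuracy = sum(t == p for t, p in zip(y_true, always_normal)) / len(y_true)
    print("accuracy:", accuracy)
    print("F1:", f1_score(y_true, always_normal))
    print("MCC:", mcc(y_true, always_normal))
```

On this toy data, a detector that never raises an alarm reaches 95% accuracy yet scores 0 on both F1 and MCC, which is why metrics of this kind are preferred over accuracy when the classes are imbalanced.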

Similar articles

       
 
Maya Hilda Lestari Louk and Bayu Adhi Tama    
As a system capable of monitoring and evaluating illegitimate network access, an intrusion detection system (IDS) profoundly impacts information security research. Since machine learning techniques constitute the backbone of IDS, it has been challenging ...

 
Viera Maslej-Krešňáková, Martin Sarnovský and Júlia Jacková
The work presented in this paper focuses on the use of data augmentation techniques applied in the domain of the detection of antisocial behavior. Data augmentation is a frequently used approach to overcome issues related to the lack of data or problems ...
Journal: Future Internet

 
Maya Hilda Lestari Louk and Bayu Adhi Tama    
Classifier ensembles have been utilized in the industrial cybersecurity sector for many years. However, their efficacy and reliability for intrusion detection systems remain questionable in current research, owing to the particularly imbalanced data issu...

 
Hugo Queiroz Abonizio, Janaina Ignacio de Morais, Gabriel Marques Tavares and Sylvio Barbon Junior    
Online Social Media (OSM) have been substantially transforming the process of spreading news, improving its speed, and reducing barriers toward reaching out to a broad audience. However, OSM are very limited in providing mechanisms to check the credibili...
Journal: Future Internet

 
Sanjiwana Arjasakusuma, Sandiaga Swahyu Kusuma and Stuart Phinn    
Machine learning has been employed for various mapping and modeling tasks using input variables from different sources of remote sensing data. For feature selection involving high- spatial and spectral dimensionality data, various methods have been devel...