Information, Vol. 12, Issue 10 (2021)

UGRansome1819: A Novel Dataset for Anomaly Detection and Zero-Day Threats

Mike Nkongolo, Jacobus Philippus van Deventer and Sydney Mambwe Kasongo

Abstract

This research introduces a methodology for producing an anomaly detection dataset that satisfies ten desirable requirements. The article then presents the resulting dataset, UGRansome, built from up-to-date network traffic (netflow) that represents cyclostationary patterns of normal and abnormal classes of threatening behaviour. The timestamp of various network attacks was found to be less than one minute, and this feature pattern was used to record the time a threat takes to infiltrate a network node. The main asset of the proposed dataset is its applicability to detecting zero-day attacks and anomalies that have not been explored before and cannot be recognised by known threat signatures. For instance, the UDP Scan attack uses the lowest netflow in the corpus, while Razy uses the highest. In turn, the EDA2 and Globe malware are the most abnormal zero-day threats in the proposed dataset. These feature patterns are included in the corpus but are derived from two well-known datasets, UGR'16 and a ransomware dataset, both of which include real-life instances. The former contributes cyclostationary patterns, while the latter contributes ransomware features. The UGRansome dataset was tested with cross-validation and compared to the KDD99 and NSL-KDD datasets to assess the performance of Ensemble Learning algorithms. False alarms were minimised, with a null empirical error during the experiment, which demonstrates that the Random Forest algorithm applied to UGRansome can produce accurate results and enhance zero-day threat detection. Additionally, most zero-day threats, such as Razy, Globe, EDA2, and TowerWeb, are recognised as advanced persistent threats that are cyclostationary in nature, and they are predicted to use spamming and phishing for intrusion. Lastly, balancing UGRansome was found to be NP-hard because real-life threat classes are not uniformly distributed in terms of the number of instances.
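
The evaluation terms used above (empirical error, false alarms, class balance) can be made concrete with a small sketch. It is illustrative only and is not the authors' evaluation code: the function names, the `"normal"` label, and the toy label lists are assumptions, and only the threat names (Razy, EDA2, Globe) come from the abstract. A "null empirical error" corresponds to `empirical_error` returning 0.0.

```python
from collections import Counter


def empirical_error(y_true, y_pred):
    """Fraction of instances whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def false_alarm_rate(y_true, y_pred, normal="normal"):
    """Share of truly normal instances that were flagged as some threat class.

    The label "normal" is an assumed name for the benign class.
    """
    preds_for_normal = [p for t, p in zip(y_true, y_pred) if t == normal]
    if not preds_for_normal:
        return 0.0
    return sum(p != normal for p in preds_for_normal) / len(preds_for_normal)


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical toy labels, invented for illustration (not taken from UGRansome).
y_true = ["normal", "normal", "normal", "normal", "Razy", "Razy", "EDA2", "Globe"]
y_pred = ["normal", "normal", "normal", "Razy", "Razy", "Razy", "EDA2", "normal"]

print(empirical_error(y_true, y_pred))   # share of misclassified instances
print(false_alarm_rate(y_true, y_pred))  # share of normal traffic flagged as a threat
print(imbalance_ratio(y_true))           # 1.0 would mean perfectly balanced classes
```

An imbalance ratio well above 1.0 reflects the non-uniform class distribution that the abstract identifies as the obstacle to balancing the dataset.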
