Future Internet, Vol. 14, Issue 9 (2022)

Use of Data Augmentation Techniques in Detection of Antisocial Behavior Using Deep Learning Methods

Viera Maslej-Krešňáková, Martin Sarnovský and Júlia Jacková

Abstract

The work presented in this paper focuses on data augmentation techniques applied to the detection of antisocial behavior. Data augmentation is a frequently used approach to overcome a lack of data or imbalanced classes. Such techniques generate artificial data samples that increase the volume of the training set or balance the target distribution. In the antisocial behavior detection domain, we frequently face both issues: a lack of quality labeled data and class imbalance. As the majority of the data in this domain is textual, we must consider augmentation methods suitable for NLP tasks. Easy data augmentation (EDA) is a group of such methods that use simple text transformations to create new, artificial samples. Our main motivation is to explore the usability of EDA techniques on selected tasks from the antisocial behavior detection domain. We focus on the class imbalance problem and apply EDA techniques to two problems: fake news classification and toxic comments classification. In both cases, we train a convolutional neural network classifier and compare its performance on the original and EDA-extended datasets. EDA techniques prove to be very task-dependent, with certain limitations resulting from the data they are applied to. The model's performance on the extended toxic comments dataset improved only marginally, with the F1 metric gaining 0.01, and only when a subset of the EDA methods was applied. In this case, EDA techniques were not well suited to texts written in more informal language. On the fake news dataset, by contrast, the improvement was more significant, with the F1 score rising by 0.1. The improvement was largest in the prediction of the minority class, where F1 rose from 0.67 to 0.86.
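To make the idea of "simple text transformations" concrete, the following is a minimal sketch of how a minority class might be extended with two transformations commonly associated with EDA, random swap and random deletion. The abstract does not list which operations the authors used or with what settings, so the function names (`random_swap`, `random_deletion`, `augment_minority`), the deletion probability, and the whitespace tokenization are all assumptions made for illustration, not the authors' implementation.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` with two random positions swapped, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment_minority(texts, labels, minority_label, copies_per_text, seed=0):
    """Append augmented copies of minority-class texts (hypothetical helper).

    Assumptions: whitespace tokenization, one swap per copy, and a deletion
    probability of 0.1. None of these values come from the paper.
    """
    rng = random.Random(seed)
    new_texts, new_labels = list(texts), list(labels)
    for text, label in zip(texts, labels):
        if label != minority_label:
            continue
        words = text.split()
        for k in range(copies_per_text):
            # Alternate between the two transformations.
            if k % 2 == 0:
                augmented = random_swap(words, 1, rng)
            else:
                augmented = random_deletion(words, 0.1, rng)
            new_texts.append(" ".join(augmented))
            new_labels.append(label)
    return new_texts, new_labels


if __name__ == "__main__":
    texts = ["the report is accurate", "shocking secret they hide", "officials confirm the data"]
    labels = [0, 1, 0]
    aug_texts, aug_labels = augment_minority(texts, labels, minority_label=1, copies_per_text=2)
    for t, l in zip(aug_texts, aug_labels):
        print(l, t)
```

Only the training split should be augmented in a setup like this, so that evaluation scores such as F1 are still computed on original, unmodified samples.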
