Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model

Shifeng Chen

Jialin Wang and Ketai He

Resumen

The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought convenience to people, it has also given rise to cyberbullying, which has a serious negative impact. The identity of online users is hidden, and due to the lack of supervision and the imperfections of relevant laws and policies, cyberbullying occurs from time to time, bringing serious mental harm and psychological trauma to the victims. The pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) has achieved good results in the field of natural language processing, which can be used for cyberbullying detection. In this research, we construct a variety of traditional machine learning, deep learning and Chinese pre-trained language models as a baseline, and propose a hybrid model based on a variant of BERT: XLNet, and deep Bi-LSTM for Chinese cyberbullying detection. In addition, real cyber bullying remarks are collected to expand the Chinese offensive language dataset COLDATASET. The performance of the proposed model outperforms all baseline models on this dataset, improving 4.29% compared to SVM?the best performing method in traditional machine learning, 1.49% compared to GRU?the best performing method in deep learning, and 1.13% compared to BERT.

Palabras claves

social media - cyberbullying detection - deep learning - language model

Acceso

PÁGINAS

pp. 0 - 0

NÚMERO

Volumen: 15 Parte: 2 (2024)

MATERIAS

INGENIERÍA Y CONSTRUCCIÓN CIVIL
TECNOLOGÍA

DOI

https://doi.org/10.3390/info15020093

Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model

Revistas destacadas