Applied Sciences, Vol. 12, No. 5 (2022)
Title

A White-Box Sociolinguistic Model for Gender Detection

Damián Morales Sánchez, Antonio Moreno and María Dolores Jiménez López

Abstract

Within the area of Natural Language Processing, we approached the Author Profiling task as a text classification problem. Based on the author's writing style, sociodemographic information such as the author's gender, age, or native language can be predicted. The exponential growth of user-generated data and the development of machine-learning techniques have led to significant advances in automatic gender detection. Unfortunately, gender detection models often become black boxes in terms of interpretability. In this paper, we propose a tree-based computational model for gender detection made up of 198 features. Unlike previous work on gender detection, we organized the features from a linguistic perspective into six categories: orthographic, morphological, lexical, syntactic, digital, and pragmatics-discursive. We implemented a Decision-Tree classifier to evaluate the performance of all feature combinations, and the experiments revealed that, on average, the classification accuracy increased by up to 3.25% with the addition of feature sets. The maximum classification accuracy was reached by a three-level model that combined lexical, syntactic, and digital features. We present the most relevant features for gender detection according to the trees generated by the classifier and contextualize the significance of the computational results with the linguistic patterns defined by previous research in relation to gender.
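As a rough illustration of what "all feature combinations" means for six categories, the sketch below enumerates every non-empty combination of the category names listed in the abstract. It is not the authors' code: the function name `feature_set_combinations` is hypothetical, and the abstract does not describe the evaluation protocol, so the step of training a classifier on each combination (for example with scikit-learn's `DecisionTreeClassifier`) is left out and only assumed.

```python
from itertools import combinations

# The six feature categories named in the abstract.
CATEGORIES = [
    "orthographic",
    "morphological",
    "lexical",
    "syntactic",
    "digital",
    "pragmatics-discursive",
]


def feature_set_combinations(categories):
    """Yield every non-empty combination of categories, smallest first.

    Hypothetical helper: each yielded tuple is one candidate model whose
    features would be passed to a classifier (assumed, not shown here).
    """
    for size in range(1, len(categories) + 1):
        for combo in combinations(categories, size):
            yield combo


all_combos = list(feature_set_combinations(CATEGORIES))

# Combinations of exactly three categories, i.e. the size of the
# best-performing model reported in the abstract.
three_category = [c for c in all_combos if len(c) == 3]

print(len(all_combos))      # 2**6 - 1 = 63 candidate models
print(len(three_category))  # C(6, 3) = 20
print(("lexical", "syntactic", "digital") in three_category)
```

With six categories the search space is small (63 candidate models), which is why an exhaustive comparison of this kind is feasible.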

Similar articles

       
 
Zhiqiong Wang, Zican Lin, Shuo Li, Yibo Wang, Weiying Zhong, Xinlei Wang and Junchang Xin    
Alzheimer's disease (AD) is a progressive, irreversible neurodegenerative disorder that requires early diagnosis for timely treatment. Functional magnetic resonance imaging (fMRI) is a non-invasive neuroimaging technique for detecting brain activity. To ...
Journal: Applied Sciences

 
Abdulwahid Al Abdulwahid    
Ethnic conflicts frequently lead to violations of human rights, such as genocide and crimes against humanity, as well as economic collapse, governmental failure, environmental problems, and massive influxes of refugees. Many innocent people suffer as a r...
Journal: Applied Sciences

 
Jaychand Upadhyay and Tad Gonsalves    
In computer vision applications, gait-based gender classification is a challenging task as a person may walk at various angles with respect to the camera viewpoint. In some of the viewing angles, the person's limb movement can be occluded from the camera...
Journal: AI

 
Haibin Liao, Li Yuan, Mou Wu, Liangji Zhong, Guonian Jin and Neal Xiong    
Facial recognition.
Journal: Applied Sciences

 
Yong Qi, Mengzhe Qiu, Hefeifei Jiang and Feiyang Wang    
The fingerprint is an important biological feature of the human body, which contains abundant biometric information. At present, the academic exploration of fingerprint gender characteristics is generally at the level of understanding, and the standardiz...
Journal: Applied Sciences