
Experimental Evaluation: Can Humans Recognise Social Media Bots?

Maxim Kolomeets, Olga Tushkanova, Vasily Desnitsky, Lidia Vitkova and Andrey Chechulin

Abstract

This paper tests the hypothesis that social media bot detection systems based on supervised machine learning may not be as accurate as researchers claim: bots have become increasingly sophisticated, making it difficult for human annotators to detect them better than random selection. As a result, a ground-truth dataset cannot be obtained through human annotation, and supervised machine-learning models inherit the annotation errors. To test this hypothesis, we conducted an experiment in which humans were tasked with recognizing malicious bots on the VKontakte social network. We then compared the "human" answers with the "ground-truth" bot labels ("a bot"/"not a bot"). Based on the experiment, we evaluated the bot detection efficiency of annotators in three scenarios that are typical for cybersecurity but differ in detection difficulty: (1) detection among random accounts, (2) detection among accounts of a social network "community", and (3) detection among verified accounts. The study showed that humans could detect only simple bots in all three scenarios and could not detect more sophisticated ones (p-value = 0.05). The study also evaluates the limits of hypothetical and existing bot detection systems that rely on non-expert-labelled datasets: the balanced accuracy of such systems can drop to 0.5 and lower, depending on bot complexity and detection scenario. The paper also describes the experiment design, the collected datasets, and the statistical evaluation and machine learning accuracy measures applied to support the results. In the discussion, we raise the question of using human labelling in bot detection systems and the cybersecurity issues it may cause. The datasets, experiment results, and software code for computing the statistical and machine learning accuracy metrics used in this paper are openly available on GitHub.
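
As a rough illustration of why a balanced accuracy of 0.5 means "no better than random selection", the sketch below computes the metric by comparing annotator answers with ground-truth labels. It is not the authors' published code; the function name `balanced_accuracy` and the toy label lists are hypothetical and chosen only for this example.

```python
def balanced_accuracy(truth, predicted):
    """Mean of sensitivity (recall on bots) and specificity (recall on non-bots).

    Both arguments are equal-length sequences of booleans,
    where True means "a bot" and False means "not a bot".
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # bots labelled as bots
    fn = sum(1 for t, p in pairs if t and not p)      # bots that were missed
    tn = sum(1 for t, p in pairs if not t and not p)  # genuine users labelled as genuine
    fp = sum(1 for t, p in pairs if not t and p)      # genuine users labelled as bots
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the ground truth")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy data: four bots followed by four genuine accounts.
truth = [True, True, True, True, False, False, False, False]

# An annotator who is right half the time on each class.
guessing_annotator = [True, False, True, False, True, False, True, False]

# An annotator who never flags anything as a bot.
never_flags = [False] * 8

print(balanced_accuracy(truth, guessing_annotator))  # 0.5
print(balanced_accuracy(truth, never_flags))         # 0.5
print(balanced_accuracy(truth, truth))               # 1.0
```

Unlike plain accuracy, this metric gives both classes equal weight, so an annotator who labels every account "not a bot" scores 0.5 no matter how rare bots are in the sample.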
