Applied Sciences, Vol. 12, Issue 21 (2022)
ARTICLE
TITLE

Retweet Prediction Based on Heterogeneous Data Sources: The Combination of Text and Multilayer Network Features

Ana Meštrović, Milan Petrović and Slobodan Beliga

Abstract

Retweet prediction is an important task in the context of various problems, such as information spreading analysis, automatic fake news detection, and social media monitoring. In this study, we explore retweet prediction based on heterogeneous data sources. To classify a tweet according to the number of retweets, we combine features extracted from a multilayer network with features extracted from text. More specifically, we introduce a multilayer framework for the network representation of Twitter. This formalism captures different users' actions and complex relationships, as well as other key properties of communication on Twitter. Next, we select a set of local network measures from each layer and construct a set of multilayer network features. We also adopt a BERT-based language model, namely Cro-CoV-cseBERT, to capture the high-level semantics and structure of tweets as a set of text features. We then train six machine learning (ML) algorithms for the retweet-prediction task: random forest, multilayer perceptron, light gradient boosting machine, category-embedding model, neural oblivious decision ensembles, and an attentive interpretable tabular learning model. We compare the performance of all six algorithms in three different setups: with text features only, with multilayer network features only, and with both feature sets. We evaluate all the setups in terms of standard evaluation measures. For this task, we first prepared an empirical dataset of 199,431 tweets in Croatian posted between 1 January 2020 and 31 May 2021. Our results indicate that the prediction model performs better when multilayer network features are integrated with text features than when only one set of features is used.
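The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer names (`retweet`, `reply`, `mention`), the choice of in-degree and out-degree as the local network measures, the toy edge list, and the three-number stand-in for a text embedding are all assumptions made for illustration. The helper names `degree_features` and `build_setups` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy interactions: (layer, source_user, target_user).
# Layer names are illustrative assumptions, not the paper's actual layer set.
EDGES = [
    ("retweet", "u1", "u2"),
    ("retweet", "u3", "u2"),
    ("reply", "u2", "u1"),
    ("mention", "u1", "u3"),
]
LAYERS = ["retweet", "reply", "mention"]


def degree_features(edges, layers):
    """Return {user: [in_deg, out_deg, ...]} with one (in, out) pair per layer.

    In-degree and out-degree stand in for the "local network measures"
    mentioned in the abstract; the paper may use different measures.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    users = set()
    for layer, src, dst in edges:
        out_deg[(layer, src)] += 1
        in_deg[(layer, dst)] += 1
        users.update((src, dst))
    return {
        user: [deg[(layer, user)] for layer in layers for deg in (in_deg, out_deg)]
        for user in sorted(users)
    }


def build_setups(text_vec, network_vec):
    """Build the three feature setups compared in the abstract."""
    return {
        "text_only": list(text_vec),
        "network_only": list(network_vec),
        "combined": list(text_vec) + list(network_vec),
    }


network = degree_features(EDGES, LAYERS)

# Placeholder values standing in for a BERT-style tweet embedding (assumption).
text_embedding = [0.12, -0.40, 0.88]

# Features for a tweet authored by user "u2".
setups = build_setups(text_embedding, network["u2"])
```

Each of the three vectors in `setups` would then be passed to a classifier of choice, so that the same model can be evaluated on text features, network features, and their concatenation.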

Similar articles

       
 
Eugenia I. Toki, Jenny Pange, Giorgos Tatsis, Konstantinos Plachouras and Ioannis G. Tsoulos    
Autism Spectrum Disorder is known to cause difficulties in social interaction and communication, as well as repetitive patterns of behavior, interests, or hobbies. These challenges can significantly affect the individual's daily life. Therefore, it is cr...
Journal: Applied Sciences

 
Jeff Dix, Jeremy Holleman and Benjamin J. Blalock    
A programmable, energy-efficient analog hardware implementation of a multilayer perceptron (MLP) is presented featuring a highly programmable system that offers the user the capability to create an MLP neural network hardware design within the available ...

 
Roberto Fernandez Martinez, Pello Jimbert, Eric Michael Sumner, Morris Riedel and Runar Unnthorsson    
The generation of a virtual, personal, auditory space to obtain a high-quality sound experience when using headphones is of great significance. Normally this experience is improved using personalized head-related transfer functions (HRTFs) that depend on...
Journal: Acoustics

 
Zhonghang Sui, Hui Shu, Fei Kang, Yuyao Huang and Guoyu Huo    
Tunnels, a key technology of traffic obfuscation, are increasingly being used to evade censorship. While providing convenience to users, tunnel technology poses a hidden danger to cybersecurity due to its concealment and camouflage capabilities. In contr...
Journal: Applied Sciences

 
Yongjian Li, He Li, Dazhao Fan, Zhixin Li and Song Ji    
Sea ice extraction and segmentation of remote sensing images is the basis for sea ice monitoring. Traditional image segmentation methods rely on manual sampling and require complex feature extraction. Deep-learning-based semantic segmentation methods hav...
Journal: Applied Sciences