ARTICLE
TITLE

COVID-19 Tweets Classification Based on a Hybrid Word Embedding Method

Yosra Didi, Ahlam Walha and Ali Wali

Abstract

In March 2020, the World Health Organization declared COVID-19 a pandemic. This deadly virus spread to and affected many countries around the world. During the outbreak, social media platforms such as Twitter contributed massive amounts of valuable data that can support health-related decision making. We therefore propose analysing users' sentiments with effective supervised machine learning approaches to predict disease prevalence and provide early warnings. First, the collected tweets were preprocessed and categorised as negative, positive, or neutral. In the second phase, features were extracted from the posts by applying several widely used techniques (TF-IDF, Word2Vec, GloVe, and FastText) to capture the features of the dataset. The novelty of this study lies in hybrid feature extraction, where we combined syntactic features (TF-IDF) with semantic features (FastText and GloVe) to represent posts accurately, which helps improve the classification process. Experimental results show that FastText combined with TF-IDF performed better with SVM than with the other models. SVM achieved the best accuracy, 88.72%, and XGBoost reached 85.29%. This study shows that hybrid methods are capable of extracting features from tweets and improving classification performance.
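The hybrid feature idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the syntactic and semantic features are combined, so this example assumes simple concatenation of a TF-IDF vector with the mean word embedding of each tweet. The tweets, the two-dimensional `toy_embeddings` table, and all function names are hypothetical stand-ins; real FastText or GloVe vectors are pretrained and have hundreds of dimensions.

```python
import math
from collections import Counter

# Hypothetical example tweets, not taken from the study's dataset.
tweets = [
    "vaccine rollout is great news",
    "lockdown is terrible news",
    "cases reported today",
]

# Hypothetical 2-dimensional vectors standing in for pretrained
# FastText or GloVe embeddings.
toy_embeddings = {
    "vaccine": [0.9, 0.1],
    "great": [0.8, 0.3],
    "lockdown": [-0.7, 0.2],
    "terrible": [-0.9, 0.1],
    "news": [0.0, 0.5],
    "cases": [-0.2, 0.4],
}
EMBED_DIM = 2


def tokenize(text):
    """Lowercase and split on whitespace (real preprocessing would do more)."""
    return text.lower().split()


def build_vocab(docs):
    """Sorted list of every distinct token in the corpus."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def idf_weights(docs, vocab):
    """Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    doc_sets = [set(tokenize(d)) for d in docs]
    return {
        term: math.log((1 + n) / (1 + sum(term in s for s in doc_sets))) + 1
        for term in vocab
    }


def tfidf_vector(doc, vocab, idf):
    """Syntactic features: relative term frequency times IDF, one slot per vocab term."""
    counts = Counter(tokenize(doc))
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocab)
    return [(counts[term] / total) * idf[term] for term in vocab]


def mean_embedding(doc, embeddings, dim):
    """Semantic features: average of the vectors of all known tokens."""
    vectors = [embeddings[tok] for tok in tokenize(doc) if tok in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def hybrid_features(doc, vocab, idf, embeddings, dim):
    """Assumed combination rule: concatenate TF-IDF and mean-embedding vectors."""
    return tfidf_vector(doc, vocab, idf) + mean_embedding(doc, embeddings, dim)


vocab = build_vocab(tweets)
idf = idf_weights(tweets, vocab)
features = [
    hybrid_features(t, vocab, idf, toy_embeddings, EMBED_DIM) for t in tweets
]

for tweet, vec in zip(tweets, features):
    print(tweet, "->", len(vec), "features")
```

Each tweet ends up as one fixed-length vector (vocabulary size plus embedding dimension), which could then be passed to a classifier such as an SVM. Concatenation is only one plausible way to combine the two views; weighting the embeddings by TF-IDF scores is another common option.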

Similar articles

       
 
Stefan Helmstetter and Heiko Paulheim    
The problem of automatic detection of fake news in social media, e.g., on Twitter, has recently drawn some attention. Although, from a technical perspective, it can be regarded as a straightforward binary classification problem, the major challenge is ...
Journal: Future Internet

 
Amgad Muneer and Suliman Mohamed Fati    
The advent of social media, particularly Twitter, raises many issues due to a misunderstanding regarding the concept of freedom of speech. One of these issues is cyberbullying, which is a critical global issue that affects both individual victims and soc...
Journal: Future Internet

 
Huan Ning, Zhenlong Li, Michael E. Hodgson and Cuizhen (Susan) Wang    
This article aims to implement a prototype screening system to identify flooding-related photos from social media. These photos, associated with their geographic locations, can provide free, timely, and reliable visual information about flood events to t...

 
Adel R. Alharbi and Amer Aljaedi    
Twitter is one of the most popular online social networks for spreading propaganda and words in the Arab region. Spammers are now creating rogue accounts to distribute adult content through Arabic tweets that Arabic norms and cultures prohibit. Arab gove...
Journal: Future Internet

 
Paolino Di Felice and Michele Iessi    
The effectiveness of disaster response depends on the correctness and timeliness of data regarding the location and the impact of the event. These two issues are critical when the data come from citizens' tweets, since the automatic classification of dis...