Algorithms, Vol. 15, No. 10 (2022)
ARTICLE
TITLE

Towards Sentiment Analysis for Romanian Twitter Content

Dan Claudiu Neagu    
Andrei Bogdan Rus    
Mihai Grec    
Mihai Augustin Boroianu    
Nicolae Bogdan and Attila Gal    

Abstract

With the increased popularity of social media platforms such as Twitter or Facebook, sentiment analysis (SA) over the microblogging content becomes of crucial importance. The literature reports good results for well-resourced languages such as English, Spanish or German, but open research space still exists for underrepresented languages such as Romanian, where there is a lack of public training datasets or pretrained word embeddings. The majority of research on Romanian SA tackles the issue in a binary classification manner (positive vs. negative), using a single public dataset which consists of product reviews. In this paper, we respond to the need for a media surveillance project to possess a custom multinomial SA classifier for usage in a restrictive and specific production setup. We describe in detail how such a classifier was built, with the help of an English dataset (containing around 15,000 tweets) translated to Romanian with a public translation service. We test the most popular classification methods that could be applied to SA, including standard machine learning, deep learning and BERT. As we could not find any results for multinomial sentiment classification (positive, negative and neutral) in Romanian, we set two benchmark accuracies of ≈78% using standard machine learning and ≈81% using BERT. Furthermore, we demonstrate that the automatic translation service does not downgrade the learning performance by comparing the accuracies achieved by the models trained on the original dataset with the models trained on the translated data.
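
The abstract does not say which standard machine learning models or features were used. Purely as an illustration, the sketch below shows what a three-class (positive, negative, neutral) classifier can look like, using a multinomial Naive Bayes model with Laplace smoothing written in plain Python. The tokenizer, the function names and the six training sentences are all invented for this example and are not taken from the paper or its dataset.

```python
import math
from collections import Counter, defaultdict

# Toy, invented examples; NOT the dataset described in the abstract.
TRAIN = [
    ("great flight and friendly crew", "positive"),
    ("love the new seats", "positive"),
    ("terrible delay and rude staff", "negative"),
    ("lost my luggage again awful", "negative"),
    ("flight departs at noon", "neutral"),
    ("what gate is the flight", "neutral"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-posterior for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        # Denominator for add-one (Laplace) smoothing.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    """Fraction of (text, label) pairs the model classifies correctly."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


model = train_naive_bayes(TRAIN)
for tweet in ["friendly crew", "rude staff awful", "what gate"]:
    print(tweet, "->", predict(model, tweet))
```

Because the tokenizer only splits on whitespace, the same code could be trained once on original text and once on a translated copy, and `accuracy` could then be applied to each model on a held-out set to compare them in the spirit of the comparison the abstract describes.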

 Similar articles

       
 
Najwa AlGhamdi, Shaheen Khatoon and Majed Alshamari    
User-generated content on numerous sites is indicative of users' sentiment towards many issues, from daily food intake to using new products. Amid the active usage of social networks and micro-blogs, notably during the COVID-19 pandemic, we may glean ins...
Journal: Applied Sciences

 
Bilal Ahmed Chandio, Ali Shariq Imran, Maheen Bakhtyar, Sher Muhammad Daudpota and Junaid Baber    
Deep neural networks have emerged as a leading approach towards handling many natural language processing (NLP) tasks. Deep networks initially conquered the problems of computer vision. However, dealing with sequential data such as text and sound was a n...
Journal: Applied Sciences

 
Charlyn Villavicencio, Julio Jerison Macrohon, X. Alphonse Inbaraj, Jyh-Horng Jeng and Jer-Guang Hsieh    
A year into the COVID-19 pandemic and one of the longest recorded lockdowns in the world, the Philippines received its first delivery of COVID-19 vaccines on 1 March 2021 through WHO's COVAX initiative. A month into inoculation of all frontline health pr...
Journal: Information

 
Naw Safrin Sattar and Shaikh Arifuzzaman    
Social media, such as Twitter, is a source of exchanging information and opinion on global issues such as COVID-19 pandemic. In this study, we work with a database of around 1.2 million tweets collected across five weeks ...
Journal: Applied Sciences

 
Retno Kusumaningrum, Titan A. Indihatmoko, Saesarinda R. Juwita, Alfi F. Hanifah, Khadijah Khadijah and Bayu Surarso    
Stunting is a condition in which children experience impaired growth and development, caused by malnutrition, repeated infections, and inadequate psychosocial stimulation. It often remains unrecognized due to a lack of awareness in the community. Therefo...
Journal: Applied Sciences