REVISTA
Applied Sciences

TODAS

Inicio / Applied Sciences / Vol: 11 Par: 5 (2021) / Artículo

ARTÍCULO

TITULO

AraSenCorpus: A Semi-Supervised Approach for Sentiment Annotation of a Large Arabic Text Corpus

Ali Al-Laith

Muhammad Shahbaz

Hind F. Alaskar and Asim Rehmat

Resumen

At a time when research in the field of sentiment analysis tends to study advanced topics in languages, such as English, other languages such as Arabic still suffer from basic problems and challenges, most notably the availability of large corpora. Furthermore, manual annotation is time-consuming and difficult when the corpus is too large. This paper presents a semi-supervised self-learning technique, to extend an Arabic sentiment annotated corpus with unlabeled data, named AraSenCorpus. We use a neural network to train a set of models on a manually labeled dataset containing 15,000 tweets. We used these models to extend the corpus to a large Arabic sentiment corpus called ?AraSenCorpus?. AraSenCorpus contains 4.5 million tweets and covers both modern standard Arabic and some of the Arabic dialects. The long-short term memory (LSTM) deep learning classifier is used to train and test the final corpus. We evaluate our proposed framework on two external benchmark datasets to ensure the improvement of the Arabic sentiment classification. The experimental results show that our corpus outperforms the existing state-of-the-art systems.

Palabras claves

corpus annotation - Arabic sentiment analysis - semi-supervised learning - self-learning - neural networks - deep learning

Acceso

PÁGINAS

pp. 0 - 0

NÚMERO

Volumen: 11 Parte: 5 (2021)

MATERIAS

INGENIERÍA Y CONSTRUCCIÓN CIVIL
TECNOLOGÍA

REVISTAS SIMILARES

Applied Sciences
Algorithms
AI

DOI

https://doi.org/10.3390/app11052434

Artículos similares

Detection of Cyberattacks and Anomalies in Cyber-Physical Systems: Approaches, Data Sources, Evaluation

Acceso

Olga Tushkanova, Diana Levshun, Alexander Branitskiy, Elena Fedorchenko, Evgenia Novikova and Igor Kotenko

Cyberattacks on cyber-physical systems (CPS) can lead to severe consequences, and therefore it is extremely important to detect them at early stages. However, there are several challenges to be solved in this area; they include an ability of the security... ver más

Revista: Algorithms

SMPT: A Semi-Supervised Multi-Model Prediction Technique for Food Ingredient Named Entity Recognition (FINER) Dataset Construction

Acceso

Kokoy Siti Komariah, Ariana Tulus Purnomo, Ardianto Satriawan, Muhammad Ogin Hasanuddin, Casi Setianingsih and Bong-Kee Sin

To pursue a healthy lifestyle, people are increasingly concerned about their food ingredients. Recently, it has become a common practice to use an online recipe to select the ingredients that match an individual?s meal plan and healthy diet preference. T... ver más

Revista: Informatics

A Wrapped Approach Using Unlabeled Data for Diabetic Retinopathy Diagnosis

Acceso

Xuefeng Zhang, Youngsung Kim, Young-Chul Chung, Sangcheol Yoon, Sang-Yong Rhee and Yong Soo Kim

Large-scale datasets, which have sufficient and identical quantities of data in each class, are the main factor in the success of deep-learning-based classification models for vision tasks. A shortage of sufficient data and interclass imbalanced data dis... ver más

Revista: Applied Sciences

A Semi-Supervised Approach to Sentiment Analysis of Tweets during the 2022 Philippine Presidential Election

Acceso

Julio Jerison E. Macrohon, Charlyn Nayve Villavicencio, X. Alphonse Inbaraj and Jyh-Horng Jeng

With the increasing popularity of Twitter as both a social media platform and a data source for companies, decision makers, advertisers, and even researchers alike, data have been so massive that manual labeling is no longer feasible. This research uses ... ver más

Revista: Information

Influential Attributed Communities via Graph Convolutional Network (InfACom-GCN)

Acceso

Nariman Adel Hussein, Hoda M. O. Mokhtar and Mohamed E. El-Sharkawi

Community search is a basic problem in graph analysis. In many applications, network nodes have certain properties that are important for the community to make sense of the application; hence, attributes are associated with nodes to capture their propert... ver más

Revista: Information

Revistas destacadas

Acceso directo a los números publicados en la revista Infrastructures

Infrastructures

Acceso directo a los números publicados en la revista Informed Infraestructure

Informed Infraestructure

Acceso directo a los números publicados en la revista BiT

Acceso directo a los números publicados en la revista Revista de la Construcción

Revista de la Construcción

Ver todas las revistas