ARTICLE
TITLE

A BI-TECHNICAL ANALYSIS FOR ARABIC STOP-WORDS DETECTION

Driss Namly    
Karim Bouzoubaa    
Abdellah Yousfi    

Abstract

Stop words are defined as words that appear frequently in texts without carrying any significant information. For the Arabic language, existing works suffer from two main drawbacks: (i) the use of proprietary corpora only and (ii) reliance on the frequency metric alone. Our approach to automatic Arabic stop-word detection uses a new metric based on a supervised machine learning process and a vector space representation. It can be applied to any corpus and takes into account both domain-independent and domain-dependent stop words. Experiments conducted to evaluate the proposed approach show a significant improvement, with the detection rate reaching 91.85% as measured by the F-measure.
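
For orientation, the frequency-only baseline that the abstract criticizes, and the F-measure used for evaluation, can be sketched as follows. This is a minimal illustration and not the authors' method: the function names, the transliterated toy tokens, and the reference list are all assumptions made for the example, and the supervised, vector-space metric proposed in the paper is not reproduced here.

```python
from collections import Counter


def frequency_candidates(tokens, top_k):
    """Frequency-only baseline: return the top_k most frequent tokens."""
    counts = Counter(tokens)
    return {word for word, _ in counts.most_common(top_k)}


def f_measure(predicted, reference):
    """Harmonic mean of precision and recall between two sets of words."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus using transliterated Arabic tokens
# ("fi" = in, "min" = from); not data from the paper.
tokens = ["fi", "albayt", "fi", "almadrasa", "min", "albayt",
          "fi", "min", "kitab", "min", "fi"]

# Hypothetical reference stop-word list ("ala" = on never occurs in the corpus).
reference = {"fi", "min", "ala"}

candidates = frequency_candidates(tokens, top_k=2)
score = f_measure(candidates, reference)
print(sorted(candidates), round(score, 2))
```

In this toy case the baseline finds the two frequent function words but misses the one that never occurs in the corpus, which hints at why a frequency count alone depends heavily on the corpus at hand.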
