Applied Sciences, Vol. 10, Issue 12 (2020)
ARTICLE
TITLE

A Polarity Capturing Sphere for Word to Vector Representation

Sandra Rizkallah, Amir F. Atiya and Samir Shaheen

Abstract

Embedding the words of a dictionary as vectors in a space has become an active research field because of its many uses in natural language processing applications. Distances between the vectors should reflect the relatedness between the corresponding words. Existing word embedding methods, however, often fail to distinguish between synonymous, antonymous, and unrelated word pairs, even though polarity detection is crucial for applications such as sentiment analysis. In this work we propose an embedding approach designed to capture polarity. The approach embeds the word vectors on a sphere, where the dot product between any two vectors represents their similarity. Vectors of synonymous words lie close to each other on the sphere, while a word and its antonym lie at opposite poles. The vectors are designed with a simple relaxation algorithm. The proposed word embedding succeeds in distinguishing between synonyms, antonyms, and unrelated word pairs. It achieves better results than some state-of-the-art techniques and competes well with the others.
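
The geometric idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm, since the abstract says only that the vectors are designed with "a simple relaxation algorithm". The sketch assumes each word pair carries a target similarity of +1 (synonyms), -1 (antonyms), or 0 (unrelated), and it repeatedly nudges two unit vectors so that their dot product moves toward the target before projecting them back onto the sphere. The function names `normalize`, `dot`, and `relax`, the parameters `dim`, `steps`, and `lr`, and the example words are all hypothetical.

```python
import math
import random


def normalize(v):
    """Scale v to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    """Dot product; for unit vectors this is the cosine of their angle."""
    return sum(a * b for a, b in zip(u, v))


def relax(words, targets, dim=8, steps=2000, lr=0.1, seed=0):
    """Toy relaxation (an assumption, not the paper's procedure).

    targets maps (word_a, word_b) to a desired dot product in [-1, 1].
    """
    rng = random.Random(seed)
    vecs = {w: normalize([rng.gauss(0, 1) for _ in range(dim)]) for w in words}
    pairs = list(targets.items())
    for _ in range(steps):
        (a, b), target = rng.choice(pairs)
        va, vb = vecs[a], vecs[b]
        err = dot(va, vb) - target
        # Gradient step on 0.5 * (dot - target)^2 for both vectors,
        # followed by a projection back onto the sphere.
        vecs[a] = normalize([x - lr * err * y for x, y in zip(va, vb)])
        vecs[b] = normalize([y - lr * err * x for x, y in zip(va, vb)])
    return vecs


# Hypothetical labels: +1 synonym, -1 antonym, 0 unrelated.
example_targets = {
    ("good", "great"): 1.0,
    ("good", "bad"): -1.0,
    ("great", "bad"): -1.0,
    ("good", "table"): 0.0,
    ("great", "table"): 0.0,
    ("bad", "table"): 0.0,
}
example_vecs = relax(["good", "great", "bad", "table"], example_targets)
for (a, b), target in example_targets.items():
    print(a, b, target, round(dot(example_vecs[a], example_vecs[b]), 3))
```

Because every vector has unit length, the dot product equals the cosine of the angle between two words, so a value near +1 places them side by side on the sphere and a value near -1 places them at opposite poles.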

Similar articles

       
 
Eduardo Cibrián, Jose María Álvarez-Rodríguez, Roy Mendieta and Juan Llorens    
The use of different techniques and tools is a common practice to cover all stages in the development life-cycle of systems generating a significant number of work products. These artefacts are frequently encoded using diverse formats, and often require ...
Journal: Applied Sciences

 
Xuyang Wang, Yajun Du, Danroujing Chen, Xianyong Li, Xiaoliang Chen, Yongquan Fan, Chunzhi Xie, Yanli Li and Jia Liu    
Domain-generalized few-shot text classification (DG-FSTC) is a new setting for few-shot text classification (FSTC). In DG-FSTC, the model is meta-trained on a multi-domain dataset, and meta-tested on unseen datasets with different domains. However, previ...
Journal: Applied Sciences

 
Huaqing Cheng, Shengquan Liu, Weiwei Sun and Qi Sun    
Topic models can extract consistent themes from large corpora for research purposes. In recent years, the combination of pretrained language models and neural topic models has gained attention among scholars. However, this approach has some drawbacks: in...
Journal: Applied Sciences

 
Yao Qin, Yiping Shi, Xinze Hao and Jin Liu    
Microblog is an important platform for mining public opinion, and it is of great value to conduct emotional analysis of microblog texts during the current epidemic. Aiming at the problem that most current emotional classification methods cannot effective...
Journal: Information

 
Musarat Karim, Malik Muhammad Saad Missen, Muhammad Umer, Saima Sadiq, Abdullah Mohamed and Imran Ashraf    
Citation creates a link between citing and the cited author, and the frequency of citation has been regarded as the basic element to measure the impact of research and knowledge-based achievements. Citation frequency has been widely used to calculate the...
Journal: Applied Sciences