Liliya Demidova, Dmitry Zhukov, Elena Andrianova and Vladimir Kalinin
To solve the problem of text clustering according to semantic groups, we suggest using a model of a unified lexico-semantic bond between texts and a similarity matrix based on it. Using lexico-semantic analysis methods, we can create "term-document" matr...
Jie Long, Zihan Li, Qi Xuan, Chenbo Fu, Songtao Peng and Yong Min
Opinion recognition for comments in Internet media is a new task in text analysis. It takes comment statements as the research object, learning the opinion tendency from the annotated original text and then performing opinion tendency recogni...
A.N. Alpatov, K.S. Popov, A.N. Chesalin
pp. 47-53
This paper investigates the problem of natural language processing using machine learning techniques, in particular, classification of unstructured heterogeneous text data sets. The paper presents a comparative analysis of some relevant and widely used m...
Anna Chizhik, Svetlana Melnikova, Victor Zakharov
pp. 75-80
The paper presents test results for sentiment analysis algorithms. They were applied to comments downloaded from the social network VKontakte. The comments were written on posts in public communities related to the discussion of the news agend...
Aliya Jangabylova, Alexander Krassovitskiy, Rustam Mussabayev and Irina Ualiyeva
Document similarity metrics are an important tool in areas such as determining the topic of documents, plagiarism detection, and problems that require capturing the semantic, syntactic, or structural similarity of texts. Evaluated result...
Jiří Krejčí and Jiří Cajthaml
The article deals with a comprehensive information system of the historic Vltava River valley. This system contains a number of resources, which are described. For old maps, which are the basis of the whole system, their georeferencing and potential prob...
Guizhe Song, Degen Huang and Zhifeng Xiao
Multilingual characteristics, lack of annotated data, and imbalanced sample distribution are the three main challenges for toxic comment analysis in a multilingual setting. This paper proposes a multilingual toxic text classifier which adopts a novel fus...
Shuai Dong, Wei Wang, Wensheng Li and Kun Zou
A 2D floor plan (FP) often contains structural, decorative, and functional elements and annotations. Vectorization of floor plans (VFP) is an object detection task that involves the localization and recognition of different structural primitives in 2D FP...
Maksym Lupei, Alexander Mitsa, Volodymyr Repariuk, Vasyl Sharkan
pp. 30-36
The paper explores the development of an effective method for text authorship identification, using publications by well-known Ukrainian journalists as material. Most existing methods require text preprocessing, which entails new costs when solving...
Anna V. Chizhik, Yulia A. Zherebtsova
pp. 50-56
In this paper, we review recent progress in developing intelligent conversational agents (or chatbots) and their current architectures (rule-based, retrieval-based and generative models), and discuss the main advantages and disadvantages of the appro...