Abstract
This paper reports the results of testing sentiment analysis algorithms on comments downloaded from the social network VKontakte. The comments were written in response to posts in public communities that discuss a city's news agenda, broken down by district. The authors collected a dataset of text data from 36 public groups. Their ultimate goal is an interactive map that reflects an index of the social well-being of citizens. Accordingly, the first stage of the study focused on thematic public communities on the selected social network that are tied to a geographic location. The authors propose a data collection technique based on analysis of the tempo-rhythm of non-verbal communication among community members. Using the collected data, several machine learning algorithms were tested in order to identify the best-performing one. Deep learning methods were outside the scope of this experiment, and such models appear unnecessary for the task at hand. The authors also discuss text vectorization methods, since a suitable vectorization model can improve the performance of sentiment analysis. Overall, the paper presents performance statistics for the algorithms tested (logistic regression, random forest, support vector machine) and describes the methods used to assess quality. A beta version of the resulting web service is available on GitHub.