ARTICLE
TITLE

RusNeuroPsych: Open Corpus for Study Relations between Author Demographic, Personality Traits, Lateral Preferences and Affect in Text

Tatiana Litvinova    
Ekaterina Ryzhkova

Abstract

A text reflects a combination of interacting characteristics of its author, both stable (gender, psychological traits, neuropsychological characteristics) and variable (feelings, emotions). These characteristics surface in a text in combination rather than in isolation. For example, according to some studies, men and women express their emotions in text differently. Other characteristics, however, also influence how authors choose to express their emotions. Studying these interactions is a critical multidisciplinary problem that calls for text corpora with relevant metadata. This paper describes RusNeuroPsych, a manually collected corpus of Russian-language texts (letters to a friend and narratives about pictures from the Thematic Apperception Test, i.e., informal writing describing emotions and opinions) annotated with information about their authors (gender, age, psychological testing scores, brain laterality preferences). To the best of our knowledge, this corpus is unique in the breadth of its author metadata. The corpus is freely available on the RusProfiling Lab webpage. We describe how the material was collected and processed, the composition and structure of the corpus, and its possible applications in different domains of knowledge.
