Hermilo Santiago-Benito, Diana-Margarita Córdova-Esparza, Noé-Alejandro Castro-Sánchez, Teresa García-Ramirez, Julio-Alejandro Romero-González and Juan Terven
This paper introduces a novel method for collecting and translating texts from the Mixtec to the Spanish language. The method comprises four primary steps. First, we collected a Mixtec–Spanish corpus that includes 4568 sentences from educational and reli...
Dezhi Cao, Yue Zhao and Licheng Wu
Data-driven construction of pronunciation dictionaries relies on high-quality, extensive training data. However, manually annotating a corpus for this purpose is both costly and time-consuming, especially for low-resource languages that ...
Mikel Penagarikano, Amparo Varona, Germán Bordel and Luis Javier Rodriguez-Fuentes
In this paper, a semisupervised speech data extraction method is presented and applied to create a new dataset designed for the development of fully bilingual Automatic Speech Recognition (ASR) systems for Basque and Spanish. The dataset is drawn from an...
Atnafu Lambebo Tonja, Olga Kolesnikova, Alexander Gelbukh and Grigori Sidorov
Despite the many proposals to solve the neural machine translation (NMT) problem of low-resource languages, it continues to be difficult. The issue becomes even more complicated when few resources cover only a single domain. In this paper, we discuss the...
Ayiguli Halike, Aishan Wumaier and Tuergen Yibulayin
Although low-resource relation extraction is vital in knowledge construction and characterization, more research is needed on the generalization of unknown relation types. To fill the gap in the study of low-resource (Uyghur) relation extraction methods,...
Yonghua Wen, Junjun Guo, Zhiqiang Yu and Zhengtao Yu
Parallel sentences play a crucial role in various NLP tasks, particularly for cross-lingual tasks such as machine translation. However, due to the time-consuming and laborious nature of manual construction, many low-resource languages still suffer from a...
Séamus Lankford, Haithem Afli and Andy Way
In this study, a human evaluation is carried out on how hyperparameter settings impact the quality of Transformer-based Neural Machine Translation (NMT) for the low-resourced English–Irish pair. SentencePiece models using both Byte Pair Encoding (BPE) an...
Konlakorn Wongpatikaseree, Sattaya Singkul, Narit Hnoohom and Sumeth Yuenyong
Language resources are the main factor in deep learning models for speech emotion recognition (SER). Thai is a low-resource language with less available data than high-resource languages such as German. This paper describes the framework of using a...
Valery Solovyev and Vladimir Ivanov
In a great deal of theoretical and applied cognitive and neurophysiological research, it is essential to have more vocabularies with concreteness/abstractness ratings. Since creating such dictionaries by interviewing informants is labor-intensive, consid...
Mihai Alexandru Niculescu, Stefan Ruseti and Mihai Dascalu
Significant progress has been achieved in text generation due to recent developments in neural architectures; nevertheless, this task remains challenging, especially for low-resource languages. This study is centered on developing a model for abstractive...