Felipe Coelho de Abreu Pinna, Victor Takashi Hayashi, João Carlos Néto, Rosangela de Fátima Pereira Marquesone, Maísa Cristina Duarte, Rodrigo Suzuki Okada and Wilson Vicente Ruggiero
Complex and long interactions (e.g., a change of topic during a conversation) justify the use of dialog systems to develop task-oriented chatbots and intelligent virtual assistants. The development of dialog systems requires considerable effort and takes...
Jiahao Fan and Weijun Pan
In recent years, automatic speech recognition (ASR) technology has improved significantly. However, the training process for an ASR model is complex, involving large amounts of data and a large number of algorithms. The task of training a new model for a...
Fenfang Li, Zhengzhang Zhao, Li Wang and Han Deng
Sentence Boundary Disambiguation (SBD) is crucial for building datasets for tasks such as machine translation, syntactic analysis, and semantic analysis. Currently, most automatic sentence segmentation in Tibetan adopts the methods of rule-based and stat...
Aaradh Nepal and Francesco Perono Cacciafoco
During the Bronze Age, the inhabitants of regions of Crete, mainland Greece, and Cyprus inscribed their languages using, among other scripts, a writing system called Linear A. These symbols, mainly characterized by combinations of lines, have, since thei...
Mondher Bouazizi, Chuheng Zheng, Siyuan Yang and Tomoaki Ohtsuki
Researchers have increasingly focused on techniques for automatically detecting dementia from the speech samples of affected individuals. Leveraging the rapid advancements in Deep Learning (DL) and Natural Languag...
Melania Nitu and Mihai Dascalu
Machine-generated content reshapes the landscape of digital information; hence, ensuring the authenticity of texts within digital libraries has become a paramount concern. This work introduces a corpus of approximately 60 k Romanian documents, including ...
Weiwei Yuan, Wanxia Yang, Liang He, Tingwei Zhang, Yan Hao, Jing Lu and Wenbo Yan
The extraction of entities and relationships is a crucial task in the field of natural language processing (NLP). However, existing models for this task often rely heavily on a substantial amount of labeled data, which not only consumes time and labor bu...
Hermilo Santiago-Benito, Diana-Margarita Córdova-Esparza, Noé-Alejandro Castro-Sánchez, Teresa García-Ramirez, Julio-Alejandro Romero-González and Juan Terven
This paper introduces a novel method for collecting and translating texts from the Mixtec to the Spanish language. The method comprises four primary steps. First, we collected a Mixtec–Spanish corpus that includes 4568 sentences from educational and reli...
Rafal Jaworski, Sanja Seljan and Ivan Dunder
Parallel corpora have been widely used in the fields of natural language processing and translation as they provide crucial multilingual information. They are used to train machine translation systems, compile dictionaries, or generate inter-language wor...
Jesus Insuasti, Felipe Roa and Carlos Mario Zapata-Jaramillo
Pre-conceptual schemas are a straightforward way to represent knowledge using controlled language regardless of context. Despite the benefits of using pre-conceptual schemas by humans, they present challenges when interpreted by computers. We propose an ...