JOURNAL
Information

ARTICLE

TITLE

Four Million Segments and Counting: Building an English-Croatian Parallel Corpus through Crowdsourcing Using a Novel Gamification-Based Platform

Rafal Jaworski, Sanja Seljan and Ivan Dunder

Abstract

Parallel corpora have been widely used in the fields of natural language processing and translation, as they provide crucial multilingual information. They are used to train machine translation systems, compile dictionaries, or generate inter-language word embeddings. Many corpora are publicly available; however, support for some languages is still limited. In this paper, the authors present a framework for collecting, organizing, and storing corpora. The solution was originally designed to obtain data for less-resourced languages, but it proved to work very well for the collection of high-value domain-specific corpora. The scenario is based on the collective work of a group of people who are motivated through gamification. The rules of the game motivate the participants to submit large resources, and a peer-review process ensures quality. More than four million translated segments have been collected so far.

Keywords

parallel corpus - data acquisition - gamification - crowdsourcing - machine translation - natural language processing

ISSUE

Volume: 14 Issue: 4 (2023)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/info14040226
