118 Articles

A Generative Artificial Intelligence Using Multilingual Large Language Models for ChatGPT Applications

Nguyen Trung Tuan, Philip Moore, Dat Ha Vu Thanh and Hai Van Pham

ChatGPT plays significant roles in the third decade of the 21st Century. Smart cities applications can be integrated with ChatGPT in various fields. This research proposes an approach for developing large language models using generative artificial intel...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 14 Num: 0 Part: 7 Year: 2024

Tibetan Sentence Boundaries Automatic Disambiguation Based on Bidirectional Encoder Representations from Transformers on Byte Pair Encoding Word Cutting Method

Fenfang Li, Zhengzhang Zhao, Li Wang and Han Deng

Sentence Boundary Disambiguation (SBD) is crucial for building datasets for tasks such as machine translation, syntactic analysis, and semantic analysis. Currently, most automatic sentence segmentation in Tibetan adopts the methods of rule-based and stat...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 14 Num: 0 Part: 7 Year: 2024

Analyzing Indo-European Language Similarities Using Document Vectors

Samuel R. Schrader and Eren Gultepe

The evaluation of similarities between natural languages often relies on prior knowledge of the languages being studied. We describe three methods for building phylogenetic trees and clustering languages without the use of language-specific information. ...

Journal: Informatics Format: Electronic

Table of contents: Vol: 10 Num: 0 Part: 4 Year: 2023

The Impact of Data Pre-Processing on Hate Speech Detection in a Mix of English and Hindi–English (Code-Mixed) Tweets

Khalil Al-Hussaeni, Mohamed Sameer and Ioannis Karamitsos

Due to the increasing reliance on social network platforms in recent years, hate speech has risen significantly among online users. Government and social media platforms face the challenging responsibility of controlling, detecting, and removing massivel...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 13 Num: 0 Part: 19 Year: 2023

Near-Optimal Active Learning for Multilingual Grapheme-to-Phoneme Conversion

Dezhi Cao, Yue Zhao and Licheng Wu

The construction of pronunciation dictionaries relies on high-quality and extensive training data in data-driven way. However, the manual annotation of corpus for this purpose is both costly and time consuming, especially for low-resource languages that ...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 13 Num: 0 Part: 16 Year: 2023

Semisupervised Speech Data Extraction from Basque Parliament Sessions and Validation on Fully Bilingual Basque–Spanish ASR

Mikel Penagarikano, Amparo Varona, Germán Bordel and Luis Javier Rodriguez-Fuentes

In this paper, a semisupervised speech data extraction method is presented and applied to create a new dataset designed for the development of fully bilingual Automatic Speech Recognition (ASR) systems for Basque and Spanish. The dataset is drawn from an...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 13 Num: 0 Part: 14 Year: 2023

A Study on Generating Webtoons Using Multilingual Text-to-Image Models

Kyungho Yu, Hyoungju Kim, Jeongin Kim, Chanjun Chun and Pankoo Kim

Text-to-image technology enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image technology involves extracting features from input text; subsequently, they are combined with noise an...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 13 Num: 0 Part: 12 Year: 2023

Multilingual Speech Recognition for Turkic Languages

Saida Mussakhojayeva, Kaisar Dauletbek, Rustem Yeshpanov and Huseyin Atakan Varol

The primary aim of this study was to contribute to the development of multilingual automatic speech recognition for lower-resourced Turkic languages. Ten languages (Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek) we...

Journal: Information Format: Electronic

Table of contents: Vol: 14 Num: 0 Part: 2 Year: 2023

Using Multiple Monolingual Models for Efficiently Embedding Korean and English Conversational Sentences

Youngki Park and Youhyun Shin

This paper presents a novel approach for finding the most semantically similar conversational sentences in Korean and English. Our method involves training separate embedding models for each language and using a hybrid algorithm that selects the appropri...

Journal: Applied Sciences Format: Electronic

Table of contents: Vol: 13 Num: 0 Part: 9 Year: 2023

On Isotropy of Multimodal Embeddings

Kirill Tyshchuk, Polina Karpikova, Andrew Spiridonov, Anastasiia Prutianova, Anton Razzhigaev and Alexander Panchenko

Embeddings, i.e., vector representations of objects, such as texts, images, or graphs, play a key role in deep learning methodologies nowadays. Prior research has shown the importance of analyzing the isotropy of textual embeddings for transformer-based ...

Journal: Information Format: Electronic

Table of contents: Vol: 14 Num: 0 Part: 7 Year: 2023
