Information, Vol. 14, No. 5 (2023)
ARTICLE
TITLE

Chinese–Vietnamese Pseudo-Parallel Sentences Extraction Based on Image Information Fusion

Yonghua Wen    
Junjun Guo    
Zhiqiang Yu and Zhengtao Yu    

Abstract

Parallel sentences play a crucial role in many NLP tasks, particularly cross-lingual tasks such as machine translation. However, because manual construction is time-consuming and laborious, many low-resource languages still lack large-scale parallel data. Pseudo-parallel sentence extraction aims to automatically identify sentence pairs in different languages that convey similar meanings. Earlier methods relied heavily on parallel data, which makes them unsuitable for low-resource scenarios. The current mainstream research direction is to use transfer learning or unsupervised learning based on cross-lingual word embeddings and multilingual pre-trained models; however, these methods are ineffective for language pairs that differ substantially. To address this issue, we propose a sentence extraction method that leverages image information fusion to extract Chinese–Vietnamese pseudo-parallel sentences from collections of bilingual texts. Our method first employs an adaptive image and text feature fusion strategy to efficiently extract bilingual parallel sentence pairs, and then presents a multimodal fusion method to balance the information between the image and text modalities. Experiments on multiple benchmarks show that, by infusing additional external image information, our method achieves promising results compared to a competitive baseline.
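
As a rough illustration of the general idea only, the sketch below fuses a text vector and an image vector with a scalar gate and then pairs sentences across languages by cosine similarity. It is not the authors' model: the function names (`gated_fusion`, `extract_pairs`), the scalar sigmoid gate, the 0.9 threshold, and the toy two-dimensional vectors are all assumptions made for this example, and it assumes text and image features already share the same dimension.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gated_fusion(text_vec, image_vec, gate_weights, gate_bias=0.0):
    """Hypothetical fusion: g = sigmoid(w . [text; image] + b), output = g*text + (1-g)*image.

    In a trained system the gate weights would be learned; here they are plain inputs.
    """
    concat = list(text_vec) + list(image_vec)
    z = sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias
    g = 1.0 / (1.0 + math.exp(-z))
    return [g * t + (1.0 - g) * i for t, i in zip(text_vec, image_vec)]


def extract_pairs(src, tgt, threshold=0.9):
    """For each source sentence, keep its best-scoring target if the score reaches the threshold."""
    pairs = []
    for s_id, s_vec in src.items():
        best_id, best_score = None, -1.0
        for t_id, t_vec in tgt.items():
            score = cosine(s_vec, t_vec)
            if score > best_score:
                best_id, best_score = t_id, score
        if best_id is not None and best_score >= threshold:
            pairs.append((s_id, best_id, best_score))
    return pairs


# Toy data (assumed): with all-zero gate weights the gate is 0.5, so fusion is a plain average.
zero_gate = [0.0, 0.0, 0.0, 0.0]
zh = {"zh1": gated_fusion([1.0, 0.0], [1.0, 0.0], zero_gate)}
vi = {
    "vi1": gated_fusion([0.8, 0.2], [1.0, 0.0], zero_gate),
    "vi2": gated_fusion([0.0, 1.0], [0.0, 1.0], zero_gate),
}
pairs = extract_pairs(zh, vi)
```

In this toy setup `zh1` is paired with `vi1`, whose fused vector points in nearly the same direction, while `vi2` is rejected. A real system would obtain the vectors from trained text and image encoders and would learn the gate rather than fix it.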

Similar articles

       
 
Ayiguli Halike, Aishan Wumaier and Tuergen Yibulayin    
Although low-resource relation extraction is vital in knowledge construction and characterization, more research is needed on the generalization of unknown relation types. To fill the gap in the study of low-resource (Uyghur) relation extraction methods,...
Journal: Applied Sciences

 
Aye Aye Mar, Kiyoaki Shirai and Natthawut Kertkeidkachorn    
Aspect-based sentiment analysis (ABSA) is a process to extract an aspect of a product from a customer review and identify its polarity. Most previous studies of ABSA focused on explicit aspects, but implicit aspects have not yet been the subject of much ...
Journal: Information

 
Qurat Ul Ain, Mohamed Amine Chatti, Komlan Gluck Charles Bakar, Shoeb Joarder and Rawaa Alatrash    
Knowledge graphs (KGs) are widely used in the education domain to offer learners a semantic representation of domain concepts from educational content and their relations, termed as educational knowledge graphs (EduKGs). Previous studies on EduKGs have i...
Journal: Information

 
Youngki Park and Youhyun Shin    
This paper presents a novel approach for finding the most semantically similar conversational sentences in Korean and English. Our method involves training separate embedding models for each language and using a hybrid algorithm that selects the appropri...
Journal: Applied Sciences

 
Yu Tang, Zhiqin He, Qinmu Wu, Xiao Wang and Yuhang Wang    
The scoliosis report is a diagnosis made by the clinician looking at X-ray images of the spine. However, with numerous images, writing the report can be time-consuming and error-prone. Therefore, this paper proposes an automatic generation model of the e...
Journal: Applied Sciences