Future Internet, Vol. 12, Issue 12 (2020)

Keeping Models Consistent between Pretraining and Translation for Low-Resource Neural Machine Translation

Wenbo Zhang, Xiao Li, Yating Yang, Rui Dong and Gongxu Luo

Abstract

Recently, model pretraining has been successfully applied to unsupervised and semi-supervised neural machine translation. A cross-lingual language model uses a pretrained masked language model to initialize the encoder and decoder of the translation model, which greatly improves translation quality. However, because of a mismatch in the number of layers, the pretrained model can initialize only part of the decoder's parameters. In this paper, we use a layer-wise coordination transformer and a consistent pretraining translation transformer instead of a vanilla transformer as the translation model. The former has only an encoder; the latter has an encoder and a decoder, but the two have exactly the same parameters. Both models guarantee that all parameters in the translation model can be initialized by the pretrained model. Experiments on Chinese-English and English-German datasets show that, compared with the vanilla transformer baseline, our models achieve better performance with fewer parameters when the parallel corpus is small.
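
The initialization gap described in the abstract can be illustrated with a small sketch. It is not the authors' code. It assumes that a pretrained masked language model layer holds a self-attention block and a feed-forward block, while a vanilla transformer decoder layer adds an encoder-decoder attention block that has no pretrained counterpart. The block names and the layer count of 6 are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: block names and layer count are assumptions,
# not taken from the paper or from any library.
PRETRAINED_LAYER = {"self_attn", "ffn"}                      # masked LM layer (assumed)
VANILLA_DECODER_LAYER = {"self_attn", "cross_attn", "ffn"}   # vanilla decoder layer (assumed)
CONSISTENT_LAYER = {"self_attn", "ffn"}                      # same structure as the pretrained layer


def coverage(target_layer, pretrained_layer, num_layers):
    """Return (initialized, total) counts of parameter blocks over all layers."""
    initialized = len(target_layer & pretrained_layer) * num_layers
    total = len(target_layer) * num_layers
    return initialized, total


vanilla = coverage(VANILLA_DECODER_LAYER, PRETRAINED_LAYER, num_layers=6)
consistent = coverage(CONSISTENT_LAYER, PRETRAINED_LAYER, num_layers=6)

print("vanilla decoder:", vanilla)        # some blocks stay randomly initialized
print("consistent decoder:", consistent)  # every block has a pretrained source
```

Under these assumptions, a decoder whose layers match the pretrained layer structure leaves no block without a pretrained source, which is the property the abstract claims for both proposed models.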
