Applied Sciences, Vol. 13, Issue 11 (2023)
ARTICLE
TITLE

End-to-End Mispronunciation Detection and Diagnosis Using Transfer Learning

Linkai Peng
Yingming Gao
Rian Bao
Ya Li
Jinsong Zhang

Abstract

As an indispensable module of computer-aided pronunciation training (CAPT) systems, mispronunciation detection and diagnosis (MDD) techniques have attracted considerable attention from academia and industry over the past decade. Training robust MDD models requires large amounts of human-annotated speech recordings, which are usually expensive and hard to acquire. In this study, we propose to use transfer learning to tackle data scarcity from two aspects. First, in the audio modality, we explore the use of the pretrained model wav2vec2.0 for MDD tasks by learning robust general acoustic representations. Second, in the text modality, we explore transferring prior texts into MDD by learning associations between the acoustic and textual modalities. We propose textual modulation gates that assign more importance to relevant text information while suppressing irrelevant text information. Moreover, given the transcriptions, we propose an extra contrastive loss to reduce the gap between the learning objectives of the phoneme recognition and MDD tasks. Experiments on the L2-Arctic dataset show that our wav2vec2.0-based models outperform conventional methods. The proposed textual modulation gate and contrastive loss further improve the F1-score by more than 2.88%, and our best model achieves an F1-score of 61.75%.
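
To make the idea of a gate that weights text information more concrete, the following is a minimal sketch of a generic sigmoid gate applied to an attention-pooled text context. It is not the paper's formulation: the abstract does not specify the gate's equations, so the scalar gate, the dot-product attention, the additive fusion, and every name here (`text_context`, `gated_fusion`, `w_acoustic`, `w_text`, `bias`) are assumptions chosen for illustration, and the weights are hand-picked rather than learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_context(frame, text_embs):
    """Attention-weighted average of text embeddings for one acoustic frame."""
    weights = softmax([dot(frame, t) for t in text_embs])
    dim = len(text_embs[0])
    return [sum(w * t[d] for w, t in zip(weights, text_embs)) for d in range(dim)]


def gated_fusion(frame, text_embs, w_acoustic, w_text, bias):
    """Fuse one acoustic frame with its text context through a scalar gate.

    Hypothetical illustration: a gate near 1 lets the text context through,
    a gate near 0 suppresses it and leaves the acoustic frame unchanged.
    """
    ctx = text_context(frame, text_embs)
    gate = sigmoid(dot(w_acoustic, frame) + dot(w_text, ctx) + bias)
    fused = [a + gate * c for a, c in zip(frame, ctx)]
    return fused, gate


# Toy inputs: one 3-dimensional acoustic frame and two phoneme embeddings.
frame = [0.2, -0.1, 0.4]
text_embs = [[0.3, 0.0, 0.5], [-0.4, 0.2, -0.1]]
fused, gate = gated_fusion(
    frame, text_embs, [0.5, 0.5, 0.5], [1.0, -1.0, 0.5], 0.0
)
print(round(gate, 3), [round(x, 3) for x in fused])
```

In a trained model the gate parameters would be learned jointly with the rest of the network; here a strongly negative `bias` drives the gate toward zero, so the fused vector falls back to the acoustic frame alone.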
