Title

Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method

Jianfeng Wu, Yongzhu Hua, Shengying Yang, Hongshuai Qin and Huibin Qin

Abstract

This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates knowledge distilled from a traditional statistical method. Unlike other DNN-based methods, which usually train many different models on the same data and then average their predictions, or use a large number of noise types to enlarge the simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. The discriminator and generator networks are then re-trained by distilling knowledge from the statistical method, a step inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME-4 dataset demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.
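
The three-stage training pipeline outlined above can be sketched roughly as follows. This is a minimal illustration only, assuming PyTorch and per-frame magnitude-spectrogram features; the network sizes, loss weights, and the statistical_enhance teacher are hypothetical placeholders and do not reflect the authors' actual configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

FREQ_BINS = 257  # assumed 512-point STFT; placeholder value


class Generator(nn.Module):
    """Maps a noisy magnitude-spectrum frame to an enhanced estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FREQ_BINS, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, FREQ_BINS), nn.Softplus(),
        )

    def forward(self, noisy):
        return self.net(noisy)


class Discriminator(nn.Module):
    """Scores (noisy, clean-or-enhanced) pairs as real or fake."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * FREQ_BINS, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1),
        )

    def forward(self, noisy, speech):
        return self.net(torch.cat([noisy, speech], dim=-1))


def statistical_enhance(noisy):
    # Placeholder for the traditional statistical teacher (e.g. an MMSE-style
    # estimator); here just crude spectral subtraction as a stand-in.
    noise_est = noisy.min(dim=0, keepdim=True).values
    return torch.clamp(noisy - noise_est, min=0.0)


G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()


def adversarial_step(noisy, clean):
    # Stage 1: joint adversarial training on simulated noisy/clean pairs.
    # Discriminator update: clean pairs are "real", generated pairs are "fake".
    fake = G(noisy).detach()
    d_loss = (bce(D(noisy, clean), torch.ones(noisy.size(0), 1))
              + bce(D(noisy, fake), torch.zeros(noisy.size(0), 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator update: fool the discriminator and stay close to the clean target.
    enhanced = G(noisy)
    g_loss = (bce(D(noisy, enhanced), torch.ones(noisy.size(0), 1))
              + 100.0 * F.l1_loss(enhanced, clean))  # weight is a placeholder
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()


def distillation_step(noisy, clean):
    # Stage 2: re-train with a soft target distilled from the statistical method
    # (the discriminator is re-trained analogously; omitted here for brevity).
    teacher = statistical_enhance(noisy)
    enhanced = G(noisy)
    loss = F.l1_loss(enhanced, clean) + 0.5 * F.l1_loss(enhanced, teacher)  # placeholder weight
    opt_g.zero_grad(); loss.backward(); opt_g.step()


def finetune_step(real_noisy):
    # Stage 3: fine-tune on real noisy speech, which has no clean reference,
    # using only the statistical teacher as supervision.
    loss = F.l1_loss(G(real_noisy), statistical_enhance(real_noisy))
    opt_g.zero_grad(); loss.backward(); opt_g.step()


# Toy usage with random tensors standing in for spectrogram frames.
noisy, clean = torch.rand(8, FREQ_BINS), torch.rand(8, FREQ_BINS)
adversarial_step(noisy, clean)
distillation_step(noisy, clean)
finetune_step(torch.rand(8, FREQ_BINS))

In this sketch, the stage-2 distillation term pulls the generator output toward the statistical estimate alongside the clean target, and the stage-3 fine-tuning relies only on the statistical teacher, since real noisy recordings come without clean references.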