Redirigiendo al acceso original de articulo en 19 segundos...
Inicio  /  Applied Sciences  /  Vol: 9 Par: 11 (2019)  /  Artículo
ARTÍCULO
TITULO

Disentangled Feature Learning for Noise-Invariant Speech Enhancement

Soo Hyun Bae    
Inkyu Choi and Nam Soo Kim    

Resumen

Most of the recently proposed deep learning-based speech enhancement techniques have focused on designing the neural network architectures as a black box. However, it is often beneficial to understand what kinds of hidden representations the model has learned. Since the real-world speech data are drawn from a generative process involving multiple entangled factors, disentangling the speech factor can encourage the trained model to result in better performance for speech enhancement. With the recent success in learning disentangled representation using neural networks, we explore a framework for disentangling speech and noise, which has not been exploited in the conventional speech enhancement algorithms. In this work, we propose a novel noise-invariant speech enhancement method which manipulates the latent features to distinguish between the speech and noise features in the intermediate layers using adversarial training scheme. To compare the performance of the proposed method with other conventional algorithms, we conducted experiments in both the matched and mismatched noise conditions using TIMIT and TSPspeech datasets. Experimental results show that our model successfully disentangles the speech and noise latent features. Consequently, the proposed model not only achieves better enhancement performance but also offers more robust noise-invariant property than the conventional speech enhancement techniques.

 Artículos similares

       
 
Chengkai Cai, Kenta Iwai and Takanobu Nishiura    
The development of distant-talk measurement systems has been attracting attention since they can be applied to many situations such as security and disaster relief. One such system that uses a device called a laser Doppler vibrometer (LDV) to acquire sou... ver más
Revista: Applied Sciences

 
Xintao Liang, Yuhang Li, Xiaomin Li, Yue Zhang and Youdong Ding    
Implementing single-channel speech enhancement under unknown noise conditions is a challenging problem. Most existing time-frequency domain methods are based on the amplitude spectrogram, and these methods often ignore the phase mismatch between noisy sp... ver más
Revista: Information

 
Vasundhara Shukla and Preety D. Swami    
This paper introduces a novel speech enhancement approach called dominant columns group orthogonalization of the sensing matrix (DCGOSM) in compressive sensing (CS). DCGOSM optimizes the sensing matrix using particle swarm optimization (PSO), ensuring se... ver más
Revista: Applied Sciences

 
Shuiyan Li, Rongzhi Qi and Shengnan Zhang    
Compared with English named entity recognition (NER), Chinese NER faces significant challenges due to the flexible, non-standard word formation and vague word boundaries, which cause a lot of boundary ambiguity and reduce the accuracy of entity identific... ver más
Revista: Applied Sciences

 
Abdullah Zaini Alsheibi, Kimon P. Valavanis, Asif Iqbal and Muhammad Naveed Aman    
With the advancement in voice-communication-based human?machine interface technology in smart home devices, the ability to decompose the received speech signal into a signal of interest and an interference component has emerged as a key requirement for t... ver más
Revista: Acoustics