Kadria Ezzine, Joseph Di Martino and Mondher Frikha
We present an any-to-one voice conversion (VC) system, using an autoregressive model and an LPCNet vocoder, aimed at enhancing the converted speech in terms of naturalness, intelligibility, and speaker similarity. As the name implies, non-parallel any-to-on...
Ismail Shahin, Ali Bou Nassif, Rameena Thomas and Shibani Hamsa
Modern developments in machine learning methodology have produced effective approaches to speech emotion recognition. The field of data mining is widely employed in numerous situations where it is possible to predict future outcomes by using the input se...
Musab T. S. Al-Kaltakchi, Ahmad Saeed Mohammad and Wai Lok Woo
Speech separation is a well-known problem, especially when there is only one sound mixture available. Estimating the Ideal Binary Mask (IBM) is one solution to this problem. Recent research has focused on the supervised classification approach. The chall...
Nikita Andriyanov
The article addresses the problem of improving the efficiency of recognizing phraseological radio exchange messages, which sometimes takes place under conditions of heightened stress for the pilot. For high-quality recognition, signal pr...
Juan Carlos Atenco, Juan Carlos Moreno and Juan Manuel Ramirez
In this work we present a bimodal multitask network for audiovisual biometric recognition. The proposed network performs the fusion of features extracted from face and speech data through a weighted sum to jointly optimize the contribution of each modali...