Jialin Zhang, Mairidan Wushouer, Gulanbaier Tuerhong and Hanfang Wang
Emotional speech synthesis is an important branch of human-computer interaction technology that aims to generate emotionally expressive and comprehensible speech based on the input text. With the rapid development of speech synthesis technology based on ...
|
Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek and Matthias Kleinert
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI)-based tools. The virtual simulation-pilot engine receives spoke...
|
Jiachen Zhang, Guoqing Tu, Shubo Liu and Zhaohui Cai
The rapid development of speech synthesis technology has significantly improved the naturalness and human-likeness of synthetic speech. As the technical barriers to speech synthesis rapidly fall, the number of illegal activities such as fraud an...
|
Navruz Madibragimov
pp. 79-86
Today, computational linguistics for the Tajik language is in the early stages of its development. To advance this area, the author is developing a project to formalize the inflections of the Tajik language for computer morpholo...
|
Kadria Ezzine, Joseph Di Martino and Mondher Frikha
We present an any-to-one voice conversion (VC) system, using an autoregressive model and LPCNet vocoder, aimed at enhancing the converted speech in terms of naturalness, intelligibility, and speaker similarity. As the name implies, non-parallel any-to-on...
|
Víctor García, Inma Hernáez and Eva Navas
In this paper, we describe the implementation and evaluation of text-to-speech synthesizers based on neural networks for Spanish and Basque. Several voices were built, all of them using a limited amount of data. The system applies Tacotron 2 to compute m...
|
Jerry Gibson and Hoontaek Oh
Speech coding is an essential technology for digital cellular communications, voice over IP, and video conferencing systems. For more than 25 years, the main approach to speech coding for these applications has been block-based analysis-by-synthesis line...
|
Noé Tits, Kevin El Haddad and Thierry Dutoit
In this paper, we study the controllability of an expressive TTS system trained on a dataset for continuous control. The dataset is the Blizzard 2013 dataset, based on audiobooks read by a female speaker and containing great variability in styles and expr...
|
Bulat Nutfullin and Eugene Ilyushin
pp. 14-20
Speech is a distinctive human trait and an evolutionary advantage over other species. Sound diarization is the process of separating audio according to which speaker each segment belongs to. Before the advent of deep learning and the ...
|
Sung Jun Cheon, Joun Yeop Lee, Byoung Jin Choi, Hyeonseung Lee and Nam Soo Kim
End-to-end neural network-based speech synthesis techniques have been developed to represent and synthesize speech in various prosodic styles. Although the end-to-end techniques enable the transfer of a style with a single vector of style representation, ...