|
|
|
Mohammed Saïd Kasttet, Abdelouahid Lyhyaoui, Douae Zbakh, Adil Aramja and Abderazzek Kachkari
Recently, artificial intelligence and data science have witnessed dramatic progress and rapid growth, especially Automatic Speech Recognition (ASR) technology based on Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs). Consequently, new end-to-...
ver más
|
|
|
|
|
|
|
Saida Mussakhojayeva, Kaisar Dauletbek, Rustem Yeshpanov and Huseyin Atakan Varol
The primary aim of this study was to contribute to the development of multilingual automatic speech recognition for lower-resourced Turkic languages. Ten languages?Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek?we...
ver más
|
|
|
|
|
|
|
Juan Carlos Atenco, Juan Carlos Moreno and Juan Manuel Ramirez
In this work we present a bimodal multitask network for audiovisual biometric recognition. The proposed network performs the fusion of features extracted from face and speech data through a weighted sum to jointly optimize the contribution of each modali...
ver más
|
|
|
|
|
|
|
Matthias Kleinert, Oliver Ohneiser, Hartmut Helmke, Shruthi Shetty, Heiko Ehr, Mathias Maier, Susanne Schacht and Hanno Wiese
The information air traffic controllers (ATCos) communicate via radio telephony is valuable for digital assistants to provide additional safety. Yet, ATCos have to enter this information manually. Assistant-based speech recognition (ABSR) has proven to b...
ver más
|
|
|
|
|
|
|
Eduardo Medeiros, Leonel Corado, Luís Rato, Paulo Quaresma and Pedro Salgueiro
Automatic speech recognition (ASR), commonly known as speech-to-text, is the process of transcribing audio recordings into text, i.e., transforming speech into the respective sequence of words. This paper presents a deep learning ASR system optimization ...
ver más
|
|
|
|
|
|
|
Ting Guo, Nurmemet Yolwas and Wushour Slamu
Recently, the performance of end-to-end speech recognition has been further improved based on the proposed Conformer framework, which has also been widely used in the field of speech recognition. However, the Conformer model is mostly applied to very wid...
ver más
|
|
|
|
|
|
|
Mikel Penagarikano, Amparo Varona, Germán Bordel and Luis Javier Rodriguez-Fuentes
In this paper, a semisupervised speech data extraction method is presented and applied to create a new dataset designed for the development of fully bilingual Automatic Speech Recognition (ASR) systems for Basque and Spanish. The dataset is drawn from an...
ver más
|
|
|
|
|
|
|
Sakshi Dua, Sethuraman Sambath Kumar, Yasser Albagory, Rajakumar Ramalingam, Ankur Dumka, Rajesh Singh, Mamoon Rashid, Anita Gehlot, Sultan S. Alshamrani and Ahmed Saeed AlGhamdi
Deep learning-based machine learning models have shown significant results in speech recognition and numerous vision-related tasks. The performance of the present speech-to-text model relies upon the hyperparameters used in this research work. In this re...
ver más
|
|
|
|
|
|
|
Moa Lee and Joon-Hyuk Chang
Speech recognition for intelligent robots seems to suffer from performance degradation due to ego-noise. The ego-noise is caused by the motors, fans, and mechanical parts inside the intelligent robots especially when the robot moves or shakes its body. T...
ver más
|
|
|
|
|
|
|
Mohammad Ali Humayun, Ibrahim A. Hameed, Syed Muslim Shah, Sohaib Hassan Khan, Irfan Zafar, Saad Bin Ahmed and Junaid Shuja
Automatic Speech Recognition, (ASR) has achieved the best results for English, with end-to-end neural network based supervised models. These supervised models need huge amounts of labeled speech data for good generalization, which can be quite a challeng...
ver más
|
|
|
|