Hyeon-Kyu Noh and Hong-June Park
A convolutional neural network (CNN) transducer decoder was proposed to reduce the decoding time of an end-to-end automatic speech recognition (ASR) system while maintaining accuracy. The CNN of 177 k parameters and a kernel size of 6 generates the proba...
Bohdan Petryshyn, Serhii Postupaiev, Soufiane Ben Bari and Armantas Ostreika
The development of autonomous driving models through reinforcement learning has gained significant traction. However, developing obstacle avoidance systems remains a challenge. Specifically, optimising path completion times while navigating obstacles is ...
Hellena Hempe, Alexander Bigalke and Mattias Paul Heinrich
Background: Degenerative spinal pathologies are highly prevalent among the elderly population. Timely diagnosis of osteoporotic fractures and other degenerative deformities enables proactive measures to mitigate the risk of severe back pain and disabilit...
Mohammed Saïd Kasttet, Abdelouahid Lyhyaoui, Douae Zbakh, Adil Aramja and Abderazzek Kachkari
Recently, artificial intelligence and data science have witnessed dramatic progress and rapid growth, especially Automatic Speech Recognition (ASR) technology based on Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs). Consequently, new end-to-...
Vijeta Sharma, Manjari Gupta, Ajai Kumar and Deepti Mishra
The video camera is essential for reliable activity monitoring, and a robust analysis helps in efficient interpretation. The systematic assessment of classroom activity through videos can help understand engagement levels from the perspective of both stu...
Georgios Karantaidis and Constantine Kotropoulos
The detection of computer-generated (CG) multimedia content has become of utmost importance due to the advances in digital image processing and computer graphics. Realistic CG images could be used for fraudulent purposes due to the deceiving recognition ...
Can Li, Hua Sun, Changhong Wang, Sheng Chen, Xi Liu, Yi Zhang, Na Ren and Deyu Tong
In order to safeguard image copyrights, zero-watermarking technology extracts robust features and generates watermarks without altering the original image. Traditional zero-watermarking methods rely on handcrafted feature descriptors to enhance their per...
Xi Lyu, Yushan Sun, Lifeng Wang, Jiehui Tan and Liwen Zhang
This study aims to solve the problems of sparse reward, single policy, and poor environmental adaptability in the local motion planning task of autonomous underwater vehicles (AUVs). We propose a two-layer deep deterministic policy gradient algorithm-bas...
Ting Guo, Nurmemet Yolwas and Wushour Slamu
The Conformer framework has recently improved the performance of end-to-end speech recognition further and is now widely used in the field. However, the Conformer model is mostly applied to very wid...
Ogbaje Andrew, Armando Apan, Dev Raj Paudyal and Kithsiri Perera
The accuracy of most SAR-based flood classification and segmentation derived from semi-automated algorithms is often limited by complicated radar backscatter. However, deep learning techniques, now widely applied in image classification, have demons...