Applied Sciences, Vol. 13, Issue 2 (2023)
ARTICLE
TITLE

Chinese Lip-Reading Research Based on ShuffleNet and CBAM

Yixian Fu, Yuanyao Lu and Ran Ni

Abstract

Lip reading has attracted increasing attention in recent years thanks to advances in deep learning. However, most research targets English datasets, and the study of Chinese lip-reading technology is still at an early stage. In this paper, we first expand "Databox", the naturally distributed word-level Chinese dataset previously built by our laboratory. Second, the current state-of-the-art model consists of a residual network and a temporal convolutional network; the residual network incurs excessive computational cost and is not suitable for on-device applications. In our new model, we replace the residual network with ShuffleNet, an extremely computation-efficient convolutional neural network (CNN) architecture. Third, to help the network focus on the most useful information, we insert a simple but effective attention module, the Convolutional Block Attention Module (CBAM), into ShuffleNet. In our experiments, we compare several model architectures and find that our model achieves accuracy comparable to that of the residual network (3.5 GFLOPs) under a computational budget of 1.01 GFLOPs.
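To put the two compute figures quoted in the abstract side by side, the short Python sketch below works out their ratio. Only the values 3.5 GFLOPs and 1.01 GFLOPs come from the abstract; the function name `compare_compute` and the dictionary keys are illustrative choices, not anything defined by the authors. Note that a lower FLOP count does not translate directly into a proportional latency gain on a given device.

```python
def compare_compute(baseline_gflops: float, model_gflops: float) -> dict:
    """Compare two per-inference compute budgets expressed in GFLOPs.

    Hypothetical helper for illustration; not part of the paper.
    """
    if baseline_gflops <= 0 or model_gflops <= 0:
        raise ValueError("GFLOPs values must be positive")
    ratio = model_gflops / baseline_gflops
    return {
        # Fraction of the baseline's operations the lighter model needs.
        "ratio": ratio,
        # Relative reduction in operations, as a percentage.
        "reduction_pct": (1.0 - ratio) * 100.0,
        # How many times fewer operations the lighter model performs.
        "fewer_ops_factor": baseline_gflops / model_gflops,
    }


# Figures quoted in the abstract: residual network vs. the proposed model.
savings = compare_compute(baseline_gflops=3.5, model_gflops=1.01)
print(f"fraction of baseline compute: {savings['ratio']:.3f}")
print(f"reduction in operations:      {savings['reduction_pct']:.1f}%")
print(f"fewer operations by a factor: {savings['fewer_ops_factor']:.2f}x")
```

Under these figures the proposed model needs a little under 30% of the baseline's operations, which is the sense in which the abstract describes it as better suited to on-device use.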

Similar articles

Hexin Lu, Xiaodong Zhu, Jingwei Cui and Haifeng Jiang    
Iris recognition performance can decline when the resolution of the iris images is insufficient. In this study, a super-resolution model for iris images, namely SwinGIris, which combines the Swin Transformer and ...
Journal: Algorithms

 
Zongshun Wang, Ce Li, Jialin Ma, Zhiqiang Feng and Limei Xiao    
In this study, we introduce a novel framework for the semantic segmentation of point clouds in autonomous driving scenarios, termed PVI-Net. This framework uniquely integrates three different data perspectives (point clouds, voxels, and distance maps) exec...
Journal: Information

 
Sakorn Mekruksavanich and Anuchit Jitpattanakul    
Smartphones have become ubiquitous, allowing people to perform various tasks anytime and anywhere. As technology continues to advance, smartphones can now sense and connect to networks, providing context-awareness for different applications. Many individ...
Journal: Information

 
Feifei He, Qinjuan Wan, Yongqiang Wang, Jiang Wu, Xiaoqi Zhang and Yu Feng    
Accurately predicting hydrological runoff is crucial for water resource allocation and power station scheduling. However, there is no perfect model that can accurately predict future runoff. In this paper, a daily runoff prediction method with a seasonal...
Journal: Water

 
Lisa Pierotti, Cristiano Fidani, Gianluca Facca and Fabrizio Gherardi    
Variations in the CO2 dissolved in water springs have long been observed near the epicenters of moderate and strong earthquakes. In a recent work focused on data collected during the 2017–2021 period from a monitoring site in the Northern Apennines, Ital...
Journal: Water