Applied Sciences, Vol. 13, Issue 11 (2023)
Title

Multi-Level Attention-Based Categorical Emotion Recognition Using Modulation-Filtered Cochleagram

Zhichao Peng, Wenhua He, Yongwei Li, Yegang Du and Jianwu Dang

Abstract

Speech emotion recognition is a critical component for achieving natural human-robot interaction. The modulation-filtered cochleagram is a feature based on auditory modulation perception that contains a multi-dimensional spectral-temporal modulation representation. In this study, we propose an emotion recognition framework that uses a multi-level attention network to extract high-level emotional feature representations from the modulation-filtered cochleagram. Our approach uses channel-level and spatial-level attention modules to generate emotional saliency maps of the channel and spatial feature representations, capturing the emotionally significant channels and feature space, respectively, from the 3D convolution feature maps. Furthermore, we employ a temporal-level attention module to capture emotionally significant regions from the concatenated feature sequence of the emotional saliency maps. Our experiments on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset demonstrate that the modulation-filtered cochleagram significantly improves the prediction performance for categorical emotion compared with the other evaluated features. Moreover, our emotion recognition framework achieves an unweighted accuracy of 71% in categorical emotion recognition, which is comparable to several existing approaches. In summary, our study demonstrates the effectiveness of the modulation-filtered cochleagram in speech emotion recognition, and our proposed multi-level attention framework provides a promising direction for future research in this field.
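The three attention levels named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the abstract does not give layer definitions, so the sketch substitutes generic, well-known forms (a squeeze-and-excitation style channel gate, a sigmoid spatial mask built from channel-mean and channel-max maps, and softmax-weighted temporal pooling). The feature-map layout `(C, T, F, M)`, all sizes, the random weights, and every function name are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat, w1, w2):
    """Gate each channel of feat (C, T, F, M) with a weight in (0, 1)."""
    squeezed = feat.mean(axis=(1, 2, 3))       # (C,) global average per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)    # (C // r,) bottleneck with ReLU
    gate = sigmoid(w2 @ hidden)                # (C,) per-channel saliency
    return feat * gate[:, None, None, None], gate

def spatial_attention(feat, w):
    """Mask each (T, F, M) position; w (2,) mixes the channel-mean and channel-max maps."""
    avg_map = feat.mean(axis=0)                # (T, F, M)
    max_map = feat.max(axis=0)                 # (T, F, M)
    mask = sigmoid(w[0] * avg_map + w[1] * max_map)
    return feat * mask[None], mask

def temporal_attention(seq, v):
    """Pool a sequence (T, D) into one vector (D,) using softmax weights over time."""
    alpha = softmax(seq @ v)                   # (T,) weights that sum to 1
    return alpha @ seq, alpha

# Hypothetical sizes and random weights stand in for trained parameters.
rng = np.random.default_rng(0)
C, T, F, M, r = 8, 20, 6, 4, 2
feat = rng.standard_normal((C, T, F, M))       # stand-in for 3D convolution feature maps
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
ws = rng.standard_normal(2)

chan_feat, gate = channel_attention(feat, w1, w2)
spat_feat, mask = spatial_attention(feat, ws)

# Concatenate the two saliency maps along the channel axis, then flatten per time step.
both = np.concatenate([chan_feat, spat_feat], axis=0)   # (2C, T, F, M)
seq = both.transpose(1, 0, 2, 3).reshape(T, -1)         # (T, 2C * F * M)
v = rng.standard_normal(seq.shape[1]) * 0.1
utterance_vec, alpha = temporal_attention(seq, v)

print(utterance_vec.shape, alpha.shape)
```

In a trained model the utterance-level vector would feed a classifier over the emotion categories; here it only demonstrates how the shapes flow from the channel and spatial saliency maps into the temporal pooling step.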
