Multi-Scale Features for Transformer Model to Improve the Performance of Sound Event Detection

Soo-Jong Kim and Yong-Joo Chung

Abstract

Sound events from competing classes vary widely in duration, which degrades sound event detection performance. To alleviate this problem, we propose a method that uses multi-scale features. We add a feature-pyramid component to a deep neural network architecture based on the Transformer encoder, which recent studies have shown to be superior to conventional recurrent neural networks for modeling the temporal correlation of sound signals. Convolutional neural network layers produce two-dimensional acoustic features that are fed into the Transformer encoders. The outputs of the Transformer encoders at different levels of the network are combined into multi-scale features, which are passed to a fully connected feed-forward neural network that acts as the final classification layer. The method is motivated by the idea that multi-scale features make the network more robust to class-dependent variation in sound duration. We also applied the proposed method to a mean-teacher model based on the Transformer encoder to demonstrate its effectiveness on a large set of unlabeled data. We evaluated the method on the DCASE 2019 Task 4 dataset. The experimental results show that the proposed architecture outperforms the baseline network without multi-scale features.
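
The idea of combining features from several time scales can be illustrated with a small, dependency-free sketch. It is not the authors' network: the Transformer encoders and convolutional front end are omitted, and the function names (`avg_pool_time`, `upsample_time`, `fuse_multi_scale`) and the pooling factors `(1, 2, 4)` are assumptions chosen only for illustration. The sketch averages a frame sequence at each scale, stretches each coarse sequence back to the original length, and concatenates the results per frame, which is roughly the kind of input a final classification layer would receive.

```python
from typing import List, Sequence

Frame = List[float]


def avg_pool_time(frames: List[Frame], factor: int) -> List[Frame]:
    """Average consecutive groups of `factor` frames (last group may be shorter)."""
    pooled = []
    for start in range(0, len(frames), factor):
        group = frames[start:start + factor]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return pooled


def upsample_time(frames: List[Frame], factor: int, length: int) -> List[Frame]:
    """Repeat each frame `factor` times, then trim to `length` frames."""
    repeated = [list(f) for f in frames for _ in range(factor)]
    return repeated[:length]


def fuse_multi_scale(frames: List[Frame],
                     factors: Sequence[int] = (1, 2, 4)) -> List[Frame]:
    """Concatenate, for every frame, the features seen at each time scale."""
    n = len(frames)
    levels = []
    for factor in factors:
        coarse = avg_pool_time(frames, factor)
        # Assumption: a real model would run a Transformer encoder on
        # `coarse` here; this sketch passes the pooled features through as-is.
        levels.append(upsample_time(coarse, factor, n))
    fused = []
    for t in range(n):
        combined: Frame = []
        for level in levels:
            combined.extend(level[t])
        fused.append(combined)
    return fused


if __name__ == "__main__":
    # Four frames with one feature each (hypothetical values).
    toy = [[1.0], [3.0], [5.0], [7.0]]
    print(fuse_multi_scale(toy))
    # [[1.0, 2.0, 4.0], [3.0, 2.0, 4.0], [5.0, 6.0, 4.0], [7.0, 6.0, 4.0]]
```

Each output frame carries its own fine-grained value alongside averages over longer windows, so a classifier sees both short and long temporal context at every time step.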

Keywords

sound event detection - transformer encoder - feature-pyramid - convolutional neural network - mean-teacher model - attention model

ISSUE

Volume: 12 Issue: 5 (2022)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/app12052626
