Information, Vol. 15, Part 2 (2024)
ARTICLE
TITLE

Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction

Yusuf Brima    
Ulf Krumnack    
Simone Pika and Gunther Heidemann    

Abstract

Self-supervised learning (SSL) has emerged as a promising paradigm for learning flexible speech representations from unlabeled data. Trained on pretext tasks that exploit statistical regularities, SSL models can capture useful representations that transfer to downstream tasks. Barlow Twins (BTs) is an SSL technique inspired by theories of redundancy reduction in human perception. In downstream tasks, BTs representations accelerate learning and transfer across applications. This study applies BTs to speech data and evaluates the resulting representations on several downstream tasks, showing the applicability of the approach. However, the approach is limited in disentangling key explanatory factors: redundancy reduction and invariance alone are insufficient to factorize the learned latents into modular, compact, and informative codes. Our ablation study isolated gains from invariance constraints, but the gains were context-dependent. Overall, this work substantiates the potential of Barlow Twins for sample-efficient speech encoding, although challenges remain in achieving fully hierarchical representations. The analysis methodology and insights presented in this paper pave the way for extensions that incorporate additional inductive priors and perceptual principles to enhance the BTs self-supervision framework.
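
To make the two terms in the title concrete, the following is a minimal sketch of the commonly cited Barlow Twins objective, not the authors' implementation. It assumes two batches of embeddings, `z_a` and `z_b`, produced by the same encoder from two augmented views of the same inputs. The invariance term pushes the diagonal of their cross-correlation matrix toward 1, and the redundancy-reduction term pushes the off-diagonal entries toward 0. The function names, the plain-list data layout, and the weight `lam=5e-3` are assumptions chosen for illustration; the abstract does not state the configuration used in the study.

```python
import math


def standardize(column):
    """Shift a feature column to zero mean and scale it to unit variance."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n
    std = math.sqrt(var) or 1.0  # guard against constant features
    return [(x - mean) / std for x in column]


def cross_correlation(z_a, z_b):
    """Cross-correlation matrix between two batches of embeddings.

    z_a, z_b: lists of N embeddings, each a list of D floats, computed
    from two augmented views of the same N inputs (hypothetical layout).
    """
    n, d = len(z_a), len(z_a[0])
    cols_a = [standardize([row[i] for row in z_a]) for i in range(d)]
    cols_b = [standardize([row[j] for row in z_b]) for j in range(d)]
    return [
        [sum(a * b for a, b in zip(cols_a[i], cols_b[j])) / n for j in range(d)]
        for i in range(d)
    ]


def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Invariance term plus lam times the redundancy-reduction term."""
    c = cross_correlation(z_a, z_b)
    d = len(c)
    # Invariance: matching features of the two views should correlate fully.
    invariance = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    # Redundancy reduction: different features should be decorrelated.
    redundancy = sum(
        c[i][j] ** 2 for i in range(d) for j in range(d) if i != j
    )
    return invariance + lam * redundancy
```

With identical views and already decorrelated features the loss vanishes; two features that are copies of each other are penalized only by the redundancy term, and two views that disagree are penalized by the invariance term. In practice the embeddings would come from a speech encoder and the loss would be computed with an automatic-differentiation library rather than plain lists.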
