JOURNAL
Algorithms

ARTICLE

TITLE

Audio Anti-Spoofing Based on Audio Feature Fusion

Jiachen Zhang, Guoqing Tu, Shubo Liu and Zhaohui Cai

Abstract

The rapid development of speech synthesis technology has significantly improved the naturalness and human-likeness of synthetic speech. As the technical barriers to speech synthesis fall, illegal activities such as fraud and extortion are increasing, posing a significant threat to authentication systems such as automatic speaker verification. In response to constantly evolving synthesis techniques, this paper proposes an end-to-end synthetic speech detection model based on audio feature fusion, with the aim of improving detection accuracy. The model uses a pre-trained wav2vec2 model to extract features from raw waveforms and an audio feature fusion module for back-end classification. The fusion module aims to improve accuracy by making fuller use of the features extracted by the front end and fusing information across time frames and feature dimensions. Data augmentation techniques are also used to improve the model's generalization. The model is trained on the training and development sets of the logical access (LA) dataset of the ASVspoof 2019 Challenge, an international standard, and is tested on the logical access (LA) and deep-fake (DF) evaluation datasets of the ASVspoof 2021 Challenge. The equal error rates (EER) on ASVspoof 2021 LA and ASVspoof 2021 DF are 1.18% and 2.62%, respectively, achieving the best results on the DF dataset.
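
The EER quoted in the abstract is the operating point at which the false acceptance rate (spoofed speech accepted) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of how such a figure can be estimated from detector scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER and its threshold.

    Assumption: a higher score means 'more likely bona fide'.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, best_eer, best_thr = None, None, None
    # Candidate thresholds: every distinct observed score.
    for thr in sorted(set(bonafide_scores) | set(spoof_scores)):
        # False acceptance: a spoofed utterance scores at or above the threshold.
        far = sum(s >= thr for s in spoof_scores) / len(spoof_scores)
        # False rejection: a bona fide utterance scores below the threshold.
        frr = sum(s < thr for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


# Toy scores (hypothetical, not taken from ASVspoof data).
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
eer, thr = equal_error_rate(bonafide, spoof)
print(f"EER = {eer:.2%} at threshold {thr}")  # EER = 25.00% at threshold 0.6
```

On finite score sets the two error rates rarely cross exactly, so this sketch reports the midpoint at the closest threshold; evaluation toolkits may interpolate instead, which can give slightly different values.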

Keywords

deep learning - wav2vec 2.0 - automatic speaker verification - deep-fake detection - ASVspoof Challenge

ISSUE

Volume: 16 Issue: 7 (2023)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Water
Algorithms
Applied Sciences

DOI

https://doi.org/10.3390/a16070317

Similar articles

Design and Implementation of Nursing-Secure-Care System with mmWave Radar by YOLO-v4 Computing Methods

Jih-Ching Chiu, Guan-Yi Lee, Chih-Yang Hsieh and Qing-You Lin

In computer vision and image processing, the shift from traditional cameras to emerging sensing tools, such as gesture recognition and object detection, addresses privacy concerns. This study navigates the Integrated Sensing and Communication (ISAC) era,...

Journal: Applied System Innovation

CapGAN: Text-to-Image Synthesis Using Capsule GANs

Maryam Omar, Hafeez Ur Rehman, Omar Bin Samin, Moutaz Alazab, Gianfranco Politano and Alfredo Benso

Text-to-image synthesis is one of the most critical and challenging problems of generative modeling. It is of substantial importance in the area of automatic learning, especially for image creation, modification, analysis and optimization. A number of wo...

Journal: Information

Semi-Supervised Learning for Robust Emotional Speech Synthesis with Limited Data

Jialin Zhang, Mairidan Wushouer, Gulanbaier Tuerhong and Hanfang Wang

Emotional speech synthesis is an important branch of human-computer interaction technology that aims to generate emotionally expressive and comprehensible speech based on the input text. With the rapid development of speech synthesis technology based on ...

Journal: Applied Sciences

End-to-End 3D Liver CT Image Synthesis from Vasculature Using a Multi-Task Conditional Generative Adversarial Network

Qianmu Xiao and Liang Zhao

Acquiring relevant, high-quality, and heterogeneous medical images is essential in various types of automated analysis, used for a variety of downstream data augmentation tasks. However, a large number of real image samples are expensive to obtain, espec...

Journal: Applied Sciences

Multicriteria Decision Making in Tourism Industry Based on Visualization of Aggregation Operators

Sergey Sakulin and Alexander Alfimtsev

The modern tourist industry is characterized by an abundance of applied multicriteria decision-making tasks. Several researchers have demonstrated that such tasks can be effectively resolved using aggregation operators based on fuzzy integrals and fuzzy ...

Journal: Applied System Innovation
