Applied Sciences, Vol. 10, Issue 19 (2020)

MAKEDONKA: Applied Deep Learning Model for Text-to-Speech Synthesis in Macedonian Language

Kostadin Mishev, Aleksandra Karovska Ristovska, Dimitar Trajanov, Tome Eftimov and Monika Simjanoska

Abstract

This paper presents MAKEDONKA, the first open-source Macedonian language synthesizer based on the Deep Learning approach. The paper provides an overview of the numerous attempts to achieve human-like, reproducible speech, which have unfortunately proven unsuccessful because the work had low visibility and lacked examples of integration with real software tools. Recent advances in Machine Learning, in particular Deep Learning-based methodologies, provide novel methods for feature engineering that allow for smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis based on Deep Voice 3, a fully-convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism. Our model directly synthesizes Macedonian speech from characters. We created a dataset that contains approximately 20 h of speech from a native Macedonian female speaker, and we use it to train the text-to-speech (TTS) model. The achieved mean opinion score (MOS) of 3.93 makes our model appropriate for application in any kind of software that needs a text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.
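
For readers unfamiliar with the metric, a mean opinion score is conventionally the arithmetic mean of listener ratings given on a 1 to 5 scale. The sketch below shows that calculation. The function name `mean_opinion_score`, the normal-approximation confidence interval, and the ratings are illustrative assumptions, not the authors' evaluation procedure or data.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `ratings` is a sequence of listener scores on the usual 1-5 scale.
    The interval uses a normal approximation, which is an assumption made
    for this sketch and is only rough for small listener panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical listener ratings, for illustration only (not data from the paper).
example_ratings = [4, 4, 3, 5, 4, 4, 3, 4]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean indicates how much the score could shift with a different set of listeners.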
