Probabilistic kernels for improved text-to-speech alignment in long audio tracks

Y Teytaut, A Roebel - Proceedings of Interspeech 2021, 2021 - hal.science

Phoneme-to-audio alignment is the task of synchronizing voice recordings and their related
phonetic transcripts. In this work, we introduce a new system to forced phonetic alignment …

Cited by 25 · Related articles · All 9 versions

Joint phoneme alignment and text-informed speech separation on highly corrupted speech

K Schulze-Forster, CSJ Doire… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

Speech separation quality can be improved by exploiting textual information. However, this
usually requires text-to-speech alignment at phoneme level. Classical alignment methods …

Cited by 30 · Related articles · All 6 versions

Semisupervised Speech Data Extraction from Basque Parliament Sessions and Validation on Fully Bilingual Basque–Spanish ASR

M Penagarikano, A Varona, G Bordel… - Applied Sciences, 2023 - mdpi.com

In this paper, a semisupervised speech data extraction method is presented and applied to
create a new dataset designed for the development of fully bilingual Automatic Speech …

Cited by 2 · Related articles · All 6 versions

A linear memory CTC-based algorithm for text-to-voice alignment of very long audio recordings

G Doras, Y Teytaut, A Roebel - Applied Sciences, 2023 - mdpi.com

Synchronisation of a voice recording with the corresponding text is a common task in speech
and music processing, and is used in many practical applications (automatic subtitling …

Cited by 6 · Related articles · All 7 versions

Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation

F López, J Luque - arXiv preprint arXiv:2210.15226, 2022 - arxiv.org

High-quality data labeling from specific domains is costly and human time-consuming. In this
work, we propose a self-supervised domain adaptation method, based upon an iterative …

Cited by 5 · Related articles · All 7 versions

Sub-sync: Automatic synchronization of subtitles in the broadcasting of true live programs in Spanish

I González-Carrasco, L Puente, B Ruiz-Mezcua… - IEEE …, 2019 - ieeexplore.ieee.org

Individuals with sensory impairment (hearing or visual) encounter serious communication
barriers within society and the world around them. These barriers hinder the communication …

Cited by 13 · Related articles · All 5 versions

A Bilingual Basque–Spanish Dataset of Parliamentary Sessions for the Development and Evaluation of Speech Technology

A Varona, M Penagarikano, G Bordel… - Applied Sciences, 2024 - mdpi.com

The development of speech technology requires large amounts of data to estimate the
underlying models. Even when relying on large multilingual pre-trained models, some …

Cited by 1 · Related articles · All 3 versions

[PDF] Semisupervised training of a fully bilingual ASR system for Basque and Spanish

M Penagarikano, A Varona, G Bordel… - Proceedings of the …, 2022 - researchgate.net

Automatic speech recognition (ASR) of speech signals with code-switching (an abrupt
language change common in bilingual communities) typically requires spoken language …

Cited by 3 · Related articles · All 7 versions

Research on Chinese audio and text alignment algorithm based on AIC-FCM and Doc2Vec

K Chen, J Huang, Y Cui, W Ren - ACM Transactions on Asian and Low …, 2023 - dl.acm.org

"Audiobook" is a multimedia-based reading technology that has emerged in recent years.
Realizing the alignment of e-book text and book audio is the most important part of its …

Cited by 4 · Related articles · All 2 versions

[PDF] Text-To-Speech Synthesizer for English, Hindi and Marathi Spoken Signals

GD Ramteke, RJ Ramteke - at British Journal of Applied Science …, 2016 - researchgate.net

The paper proposes a model of Text-To-Speech (TTS) engine for Marathi, Hindi and English
languages. The characters and their representation are analyzed and synthesized with the …

Cited by 6 · Related articles · All 2 versions
