Controllable Accented Text-to-Speech Synthesis With Fine and Coarse-Grained Intensity Rendering

R Liu, B Sisman, G Gao, H Li - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a
variant of the standard version (L1), which is challenging as L2 is different from L1 in terms …

Towards zero-shot multi-speaker multi-accent text-to-speech synthesis

M Zhang, X Zhou, Z Wu, H Li - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org
This letter presents a framework towards multi-accent neural text-to-speech synthesis for
zero-shot multi-speaker, which employs an encoder-decoder architecture and an accent …

Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision

Z Jia, H Xue, X Peng, Y Lu - Proceedings of the 32nd ACM International …, 2024 - dl.acm.org
Low resource of parallel data is the key challenge of accent conversion (AC) problem in
which both the pronunciation units and prosody pattern need to be converted. We propose a …

DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech

J Melechovsky, A Mehrish, B Sisman… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in Text-to-Speech (TTS) systems have enabled the generation of
natural and expressive speech from textual input. Accented TTS aims to enhance user …

Neural Speech Synthesis for Austrian Dialects with Standard German Grapheme-to-Phoneme Conversion and Dialect Embeddings

L Gutscher, M Pucher, V Garcia - Proc. 2nd Annual Meeting of the …, 2023 - researchgate.net
For languages where extensive audio data and text transcriptions are available, text-to-
speech (TTS) systems have showcased the ability to generate speech that closely …

Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS

TN Nguyen, S Akti, NQ Pham, A Waibel - arXiv preprint arXiv:2410.14997, 2024 - arxiv.org
Previous approaches on accent conversion (AC) mainly aimed at making non-native speech
sound more native while maintaining the original content and speaker identity. However …

Diffusion-Based Method with TTS Guidance for Foreign Accent Conversion

Q Bai, S Wang, Z Liu, M Zhang, W Rao… - 2024 IEEE 14th …, 2024 - ieeexplore.ieee.org
Accent conversion (AC) aims to alter the accent of spoken language while preserving the
original content and speaker characteristics. While any accent can be selected as a target …

Non-autoregressive real-time Accent Conversion model with voice cloning

V Nechaev, S Kosyakov - arXiv preprint arXiv:2405.13162, 2024 - arxiv.org
Currently, the development of Foreign Accent Conversion (FAC) models utilizes deep neural
network architectures, as well as ensembles of neural networks for speech recognition and …

Transfer the linguistic representations from TTS to accent conversion with non-parallel data

X Chen, J Pei, L Xue, M Zhang - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Accent conversion aims to convert the accent of a source speech to a target accent,
meanwhile preserving the speaker's identity. This paper introduces a novel non …