Emotional voice conversion: Theory, databases and ESD

A Triantafyllopoulos, BW Schuller… - Proceedings of the …, 2023 - ieeexplore.ieee.org

Speech is the fundamental mode of human communication, and its synthesis has long been
a core priority in human–computer interaction research. In recent years, machines have …

Cited by 27 · 7 versions

Automatic speech recognition using limited vocabulary: A survey

JLKE Fendji, DCM Tala, BO Yenke… - Applied Artificial …, 2022 - Taylor & Francis

Automatic Speech Recognition (ASR) is an active field of research due to its
large number of applications and the proliferation of interfaces or computing devices that …

Cited by 36 · 6 versions

Textless speech emotion conversion using discrete and decomposed representations

F Kreuk, A Polyak, J Copet, E Kharitonov… - arXiv preprint arXiv …, 2021 - arxiv.org

Speech emotion conversion is the task of modifying the perceived emotion of a speech
utterance while preserving the lexical content and speaker identity. In this study, we cast the …

Cited by 51 · 6 versions

Emotion intensity and its control for emotional voice conversion

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …

Cited by 42 · 7 versions

Speech synthesis with mixed emotions

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …

Cited by 29 · 6 versions

Probing speech emotion recognition transformers for linguistic knowledge

A Triantafyllopoulos, J Wagner, H Wierstorf… - arXiv preprint arXiv …, 2022 - arxiv.org

Large, pre-trained neural networks consisting of self-attention layers (transformers) have
recently achieved state-of-the-art results on several speech emotion recognition (SER) …

Cited by 22 · 9 versions

Limited data emotional voice conversion leveraging text-to-speech: Two-stage sequence-to-sequence training

K Zhou, B Sisman, H Li - arXiv preprint arXiv:2103.16809, 2021 - arxiv.org

Emotional voice conversion (EVC) aims to change the emotional state of an utterance while
preserving the linguistic content and speaker identity. In this paper, we propose a novel 2 …

Cited by 30 · 8 versions

Neural Emotion Director: Speech-preserving semantic control of facial expressions in "in-the-wild" videos

FP Papantoniou, PP Filntisis… - Proceedings of the …, 2022 - openaccess.thecvf.com

In this paper, we introduce a novel deep learning method for photo-realistic manipulation of
the emotional state of actors in "in-the-wild" videos. The proposed method is based on a …

Cited by 20 · 7 versions

Grad-StyleSpeech: Any-speaker adaptive text-to-speech synthesis with diffusion models

M Kang, D Min, SJ Hwang - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

There has been a significant progress in Text-To-Speech (TTS) synthesis technology in
recent years, thanks to the advancement in neural generative modeling. However, existing …

Cited by 14 · 4 versions

PMVC: Data augmentation-based prosody modeling for expressive voice conversion

Y Deng, H Tang, X Zhang, J Wang, N Cheng… - Proceedings of the 31st …, 2023 - dl.acm.org

Voice conversion as the style transfer task applied to speech, refers to converting one
person's speech into a new speech that sounds like another person's. Up to now, there has …

Cited by 6 · 4 versions
