Hierarchical sequence to sequence voice conversion with limited data

Voice transformer network: Sequence-to-sequence voice conversion using transformer with text-to-speech pretraining

WC Huang, T Hayashi, YC Wu, H Kameoka… - arXiv preprint arXiv …, 2019 - arxiv.org

We introduce a novel sequence-to-sequence (seq2seq) voice conversion (VC) model based
on the Transformer architecture with text-to-speech (TTS) pretraining. Seq2seq VC models …

Cited by 115 - Related articles - All 8 versions

Pretraining techniques for sequence-to-sequence voice conversion

WC Huang, T Hayashi, YC Wu… - … /ACM Transactions on …, 2021 - ieeexplore.ieee.org

Sequence-to-sequence (seq2seq) voice conversion (VC) models are attractive owing to
their ability to convert prosody. Nonetheless, without sufficient data, seq2seq VC models can …

Cited by 48 - Related articles - All 6 versions

Generating expressive speech audio from text data

S Gururani, K Gupta, D Shah, Z Shakeri… - US Patent …, 2022 - Google Patents

A system for use in video game development to generate expressive
speech audio comprises a user interface configured to receive user-input text data and a …

Cited by 12 - Related articles - All 4 versions

"I'm Having Trouble Understanding You Right Now": A Multi-Dimensional Evaluation of the Intelligibility of Dysphonic Speech

M Moore - 2020 - search.proquest.com

Individuals with voice disorders experience challenges communicating daily. These
challenges lead to a significant decrease in the quality of life for individuals with dysphonia …

Cited by 2 - Related articles - All 2 versions

Generating speech in the voice of a player of a video game

Z Shakeri, J Pinto, K Gupta, M Sardari… - US Patent …, 2023 - Google Patents

A computer-implemented method of generating speech audio in a video game is provided.
The method includes inputting, into a synthesizer module, input data that represents speech …

Cited by 2 - Related articles - All 2 versions

Speaker conversion for video games

K Gupta, D Shah, Z Shakeri, J Pinto, M Sardari… - US Patent …, 2023 - Google Patents

Jia, YE, et al. “Transfer Learning from Speaker Verification to Multispeaker Text-to-Speech
Synthesis” arXiv preprint arXiv:1806.04558 (2018), Retrieved from https://arxiv …

Generating expressive speech audio from text data

S Gururani, K Gupta, D Shah, Z Shakeri… - US Patent …, 2024 - Google Patents

A system for use in video game development to generate expressive speech audio
comprises a user interface configured to receive user-input text data and a user selection of …
