Voice transformer network: Sequence-to-sequence voice conversion using transformer with text-to-speech pretraining

WC Huang, T Hayashi, YC Wu, H Kameoka… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce a novel sequence-to-sequence (seq2seq) voice conversion (VC) model based
on the Transformer architecture with text-to-speech (TTS) pretraining. Seq2seq VC models …

Pretraining techniques for sequence-to-sequence voice conversion

WC Huang, T Hayashi, YC Wu… - … /ACM Transactions on …, 2021 - ieeexplore.ieee.org
Sequence-to-sequence (seq2seq) voice conversion (VC) models are attractive owing to
their ability to convert prosody. Nonetheless, without sufficient data, seq2seq VC models can …

Generating expressive speech audio from text data

S Gururani, K Gupta, D Shah, Z Shakeri… - US Patent …, 2022 - Google Patents
A system for use in video game development to generate expressive
speech audio comprises a user interface configured to receive user-input text data and a …

"I'm Having Trouble Understanding You Right Now": A Multi-Dimensional Evaluation of the Intelligibility of Dysphonic Speech

M Moore - 2020 - search.proquest.com
Individuals with voice disorders experience challenges communicating daily. These
challenges lead to a significant decrease in the quality of life for individuals with dysphonia …

Generating speech in the voice of a player of a video game

Z Shakeri, J Pinto, K Gupta, M Sardari… - US Patent …, 2023 - Google Patents
A computer-implemented method of generating speech audio in a video game is provided.
The method includes inputting, into a synthesizer module, input data that represents speech …

Speaker conversion for video games

K Gupta, D Shah, Z Shakeri, J Pinto, M Sardari… - US Patent …, 2023 - Google Patents
Jia, Ye, et al. “Transfer Learning from Speaker Verification to Multispeaker Text-to-Speech
Synthesis” arXiv preprint arXiv:1806.04558 (2018), Retrieved from https://arxiv …

Generating expressive speech audio from text data

S Gururani, K Gupta, D Shah, Z Shakeri… - US Patent …, 2024 - Google Patents
A system for use in video game development to generate expressive speech audio
comprises a user interface configured to receive user-input text data and a user selection of …