Varying speaking styles with neural text-to-speech

G Zellou, N Holliday - Frontiers in Computer Science, 2024 - frontiersin.org

This article reviews recent literature investigating speech variation in production and
comprehension during spoken language communication between humans and devices …

Cited by 2 · Related articles

[HTML] aip.org

[HTML] The clear speech intelligibility benefit for text-to-speech voices: Effects of speaking style and visual guise

NB Aoki, M Cohn, G Zellou - JASA Express Letters, 2022 - pubs.aip.org

This study examined how speaking style and guise influence the intelligibility of text-to-
speech (TTS) and naturally produced human voices. Results showed that TTS voices were …

Cited by 19 · Related articles · All 6 versions

[PDF] ieee.org

Modeling of Rakugo speech and its limitations: Toward speech synthesis that entertains audiences

S Kato, Y Yasuda, X Wang, E Cooper, S Takaki… - IEEE …, 2020 - ieeexplore.ieee.org

We have been investigating rakugo speech synthesis as a challenging example of speech
synthesis that entertains audiences. Rakugo is a traditional Japanese form of verbal …

Cited by 13 · Related articles · All 6 versions

[PDF] isca-archive.org

[PDF] Rakugo speech synthesis using segment-to-segment neural transduction and style tokens: toward speech synthesis for entertaining audiences

S Kato, Y Yasuda, X Wang, E Cooper… - Proc. 10th ISCA …, 2019 - isca-archive.org

We have been working on constructing rakugo speech synthesis as a challenging example
of speech synthesis that entertains audiences. Rakugo is a traditional Japanese form of …

Cited by 8 · Related articles · All 4 versions

[PDF] ed.ac.uk

Synthesising prosody with insufficient context

Z Hodari - 2022 - era.ed.ac.uk

Prosody is a key component in human spoken communication, signalling emotion, attitude,
information structure, intention, and other communicative functions through perceived …

Cited by 2 · Related articles · All 3 versions

[PDF] springer.com

Trustworthiness of voice-based assistants: integrating interlocutor and intermediary predictors

L Weidmüller, K Etzrodt, S Engesser - Publizistik, 2022 - Springer

When intelligent voice-based assistants (VBAs) present news, they simultaneously act as
interlocutors and intermediaries, enabling direct and mediated communication. Hence, this …

Cited by 3 · Related articles · All 3 versions

[PDF] myrtle.ai

[PDF] Exploiting unstructured sparsity on next-generation datacenter hardware

M Ashby, C Baaij, P Baldwin, M Bastiaan, O Bunting… - 2019 - myrtle.ai

Recurrent neural networks (RNNs) form a significant proportion of data center deep learning
inference (29%[1]). This includes workloads like machine translation, speech synthesis and …

Cited by 3 · Related articles · All 3 versions

Trustworthiness of voice-based assistants

L Weidmüller, K Etzrodt, S Engesser - tud.qucosa.de

Abstract (EN) When intelligent voice-based assistants (VBAs) present news, they
simultaneously act as interlocutors and intermediaries, enabling direct and mediated …

[PDF] soken.ac.jp

[PDF] Rakugo Speech Synthesis: Toward Speech Synthesis That Entertains Audiences

K Shuhei - 2021 - ir.soken.ac.jp

Conventional speech synthesis research has focused on transferring information which the
speech should have, such as content and speakers' emotions, personality, intention …

[PDF] Modeling of Rakugo Speech and Its Various Speaking Styles: Toward Speech Synthesis That Entertains Audiences

S KATO, Y YASUDA - 2019 - researchgate.net

We have been working on building rakugo speech synthesis as a challenging example of
speech synthesis that entertains audiences. Rakugo is a traditional Japanese form of verbal …
