Linguistic analysis of human-computer interaction

G Zellou, N Holliday - Frontiers in Computer Science, 2024 - frontiersin.org
This article reviews recent literature investigating speech variation in production and
comprehension during spoken language communication between humans and devices …

The clear speech intelligibility benefit for text-to-speech voices: Effects of speaking style and visual guise

NB Aoki, M Cohn, G Zellou - JASA Express Letters, 2022 - pubs.aip.org
This study examined how speaking style and guise influence the intelligibility of text-to-
speech (TTS) and naturally produced human voices. Results showed that TTS voices were …

Modeling of Rakugo speech and its limitations: Toward speech synthesis that entertains audiences

S Kato, Y Yasuda, X Wang, E Cooper, S Takaki… - IEEE …, 2020 - ieeexplore.ieee.org
We have been investigating rakugo speech synthesis as a challenging example of speech
synthesis that entertains audiences. Rakugo is a traditional Japanese form of verbal …

Rakugo speech synthesis using segment-to-segment neural transduction and style tokens—toward speech synthesis for entertaining audiences

S Kato, Y Yasuda, X Wang, E Cooper… - Proc. 10th ISCA …, 2019 - isca-archive.org
We have been working on constructing rakugo speech synthesis as a challenging example
of speech synthesis that entertains audiences. Rakugo is a traditional Japanese form of …

Synthesising prosody with insufficient context

Z Hodari - 2022 - era.ed.ac.uk
Prosody is a key component in human spoken communication, signalling emotion, attitude,
information structure, intention, and other communicative functions through perceived …

Trustworthiness of voice-based assistants: integrating interlocutor and intermediary predictors

L Weidmüller, K Etzrodt, S Engesser - Publizistik, 2022 - Springer
When intelligent voice-based assistants (VBAs) present news, they simultaneously act as
interlocutors and intermediaries, enabling direct and mediated communication. Hence, this …

Exploiting unstructured sparsity on next-generation datacenter hardware

M Ashby, C Baaij, P Baldwin, M Bastiaan, O Bunting… - 2019 - myrtle.ai
Recurrent neural networks (RNNs) form a significant proportion of data center deep learning
inference (29% [1]). This includes workloads like machine translation, speech synthesis and …

Trustworthiness of voice-based assistants

L Weidmüller, K Etzrodt, S Engesser - tud.qucosa.de
Abstract (EN) When intelligent voice-based assistants (VBAs) present news, they
simultaneously act as interlocutors and intermediaries, enabling direct and mediated …

Rakugo Speech Synthesis: Toward Speech Synthesis That Entertains Audiences

K Shuhei - 2021 - ir.soken.ac.jp
Conventional speech synthesis research has focused on transferring information which the
speech should have, such as content and speakers' emotions, personality, intention …

Modeling of Rakugo Speech and Its Various Speaking Styles: Toward Speech Synthesis That Entertains Audiences

S Kato, Y Yasuda - 2019 - researchgate.net
We have been working on building rakugo speech synthesis as a challenging example of
speech synthesis that entertains audiences. Rakugo is a traditional Japanese form of verbal …