Hierarchical representation and estimation of prosody using continuous wavelet transform

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 452  Related articles  All 2 versions

A novel tracking deep wavelet auto-encoder method for intelligent fault diagnosis of electric locomotive bearings

S Haidong, J Hongkai, Z Ke, W Dongdong… - Mechanical Systems and …, 2018 - Elsevier

The condition monitoring of electric locomotive has attracted more and more attention due to
its significance for improving the security, reliability and automation level. In this paper, a …

Cited by 112  Related articles  All 2 versions

[PDF] springer.com

Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

H Barakat, O Turk, C Demiroglu - EURASIP Journal on Audio, Speech, and …, 2024 - Springer

Speech synthesis has made significant strides thanks to the transition from machine learning
to deep learning models. Contemporary text-to-speech (TTS) models possess the capability …

Cited by 10  Related articles  All 6 versions

[PDF] nature.com

Speech prosody enhances the neural processing of syntax

G Degano, PW Donhauser, L Gwilliams… - Communications …, 2024 - nature.com

Human language relies on the correct processing of syntactic information, as it is essential
for successful communication between speakers. As an abstract level of language, syntax …

Cited by 5  Related articles  All 6 versions

[PDF] arxiv.org

Hierarchical prosody modeling for non-autoregressive speech synthesis

CM Chien, H Lee - 2021 IEEE Spoken Language Technology …, 2021 - ieeexplore.ieee.org

Prosody modeling is an essential component in modern text-to-speech (TTS) frameworks.
By explicitly providing prosody features to the TTS model, the style of synthesized utterances …

Cited by 37  Related articles  All 4 versions

[PDF] arxiv.org

Prosody-controllable spontaneous TTS with neural HMMs

H Lameris, S Mehta, GE Henter… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Spontaneous speech has many affective and pragmatic functions that are interesting and
challenging to model in TTS. However, the presence of reduced articulation, fillers …

Cited by 18  Related articles  All 5 versions

[HTML] sciencedirect.com

[HTML] Spoken Language Identification: An overview of past and present research trends

D O'Shaughnessy - Speech Communication, 2024 - Elsevier

Identification of the language used in spoken utterances is useful for multiple applications,
e.g., assist in directing or automating telephone calls, or selecting which language-specific …

[HTML] sciencedirect.com

[HTML] Prosody and fluency of Finland Swedish as a second language: Investigating global parameters for automated speaking assessment

H Kallio, M Kautonen, M Kuronen - Speech Communication, 2023 - Elsevier

This study investigates prosody and fluency of Finland Swedish as a second language (L2).
The main objective is to investigate global measures of prosody and fluency as predictors of …

Cited by 10  Related articles  All 3 versions

[HTML] sciencedirect.com

[HTML] Event-related responses reflect chunk boundaries in natural speech

I Anurova, S Vetchinnikova, A Dobrego, N Williams… - NeuroImage, 2022 - Elsevier

Chunking language has been proposed to be vital for comprehension enabling the
extraction of meaning from a continuous stream of speech. However, neurocognitive …

Cited by 15  Related articles  All 11 versions

[PDF] arxiv.org

Predicting prosodic prominence from text with pre-trained contextualized word representations

A Talman, A Suni, H Celikkanat, S Kakouros… - arXiv preprint arXiv …, 2019 - arxiv.org

In this paper we introduce a new natural language processing dataset and benchmark for
predicting prosodic prominence from written text. To our knowledge this will be the largest …

Cited by 36  Related articles  All 10 versions
