Deep representation learning in speech processing: Challenges, recent advances, and future trends

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

Cited by 268 · Related articles · All 10 versions

Machine Learning, Deep Learning and Statistical Analysis for forecasting building energy consumption—A systematic review

M Khalil, AS McGough, Z Pourmirza… - … Applications of Artificial …, 2022 - Elsevier

The building sector accounts for 36% of the total global energy usage and 40% of
associated Carbon Dioxide emissions. Therefore, the forecasting of building energy …

Cited by 98 · Related articles · All 5 versions

[PDF] arxiv.org

Self-supervised learning for time series analysis: Taxonomy, progress, and prospects

K Zhang, Q Wen, C Zhang, R Cai, M Jin… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Self-supervised learning (SSL) has recently achieved impressive performance on various
time series tasks. The most prominent advantage of SSL is that it reduces the dependence …

Cited by 49 · Related articles · All 7 versions

[PDF] arxiv.org

Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset

K Zhou, B Sisman, R Liu, H Li - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Emotional voice conversion aims to transform emotional prosody in speech while preserving
the linguistic content and speaker identity. Prior studies show that it is possible to …

Cited by 170 · Related articles · All 8 versions

[PDF] sciencedirect.com

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier

In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Cited by 122 · Related articles · All 7 versions

Speech technology for healthcare: Opportunities, challenges, and state of the art

S Latif, J Qadir, A Qayyum, M Usama… - IEEE Reviews in …, 2020 - ieeexplore.ieee.org

Speech technology is not appropriately explored even though modern advances in speech
technology—especially those driven by deep learning (DL) technology—offer …

Cited by 124 · Related articles · All 3 versions

[PDF] arxiv.org

Towards learning a universal non-semantic representation of speech

J Shor, A Jansen, R Maor, O Lang, O Tuval… - arXiv preprint arXiv …, 2020 - arxiv.org

The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a
pre-existing embedding model trained for different datasets or tasks. The visual and …

Cited by 153 · Related articles · All 9 versions

Conventional and contemporary approaches used in text to speech synthesis: A review

N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer

Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …

Cited by 35 · Related articles · All 3 versions

[PDF] arxiv.org

Beyond just vision: A review on self-supervised representation learning on multimodal and temporal data

S Deldari, H Xue, A Saeed, J He, DV Smith… - arXiv preprint arXiv …, 2022 - arxiv.org

Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in
the field of computer vision, speech, natural language processing (NLP), and recently, with …

Cited by 34 · Related articles · All 2 versions

[PDF] springer.com

Multi-channel spectrograms for speech processing applications using deep learning methods

T Arias-Vergara, P Klumpp, JC Vasquez-Correa… - Pattern Analysis and …, 2021 - Springer

Time–frequency representations of the speech signals provide dynamic information about
how the frequency component changes with time. In order to process this information, deep …

Cited by 73 · Related articles · All 8 versions
