Improving GANs for speech enhancement

A Wali, Z Alamgir, S Karim, A Fawaz, MB Ali… - Computer Speech & …, 2022 - Elsevier

Generative adversarial networks (GANs) have seen remarkable progress in recent years.
They are used as generative models for all kinds of data such as text, images, audio, music …

Cited by 48 - Related articles - All 2 versions

Real time speech enhancement in the waveform domain

A Defossez, G Synnaeve, Y Adi - arXiv preprint arXiv:2006.12847, 2020 - arxiv.org

We present a causal speech enhancement model working on the raw waveform that runs in
real-time on a laptop CPU. The proposed model is based on an encoder-decoder …

Cited by 432 - Related articles - All 8 versions

Conditional diffusion probabilistic model for speech enhancement

YJ Lu, ZQ Wang, S Watanabe… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Speech enhancement is a critical component of many user-oriented audio applications, yet
current systems still suffer from distorted and unnatural outputs. While generative models …

Cited by 123 - Related articles - All 7 versions

[HTML] Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer

Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Cited by 13 - Related articles - All 8 versions

Universal speech enhancement with score-based diffusion

J Serrà, S Pascual, J Pons, RO Araz… - arXiv preprint arXiv …, 2022 - arxiv.org

Removing background noise from speech audio has been the subject of considerable effort,
especially in recent years due to the rise of virtual communication and amateur recordings …

Cited by 74 - Related articles - All 4 versions

HiFi-GAN: High-fidelity denoising and dereverberation based on speech deep features in adversarial networks

J Su, Z Jin, A Finkelstein - arXiv preprint arXiv:2006.05694, 2020 - arxiv.org

Real-world audio recordings are often degraded by factors such as noise, reverberation,
and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to …

Cited by 155 - Related articles - All 10 versions

Speech denoising in the waveform domain with self-attention

Z Kong, W Ping, A Dantrey… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

In this work, we present CleanUNet, a causal speech denoising model on the raw waveform.
The proposed model is based on an encoder-decoder architecture combined with several …

Cited by 47 - Related articles - All 3 versions

CMGAN: Conformer-based metric-GAN for monaural speech enhancement

S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …

Cited by 26 - Related articles - All 8 versions

A time-frequency attention module for neural speech enhancement

Q Zhang, X Qian, Z Ni, A Nicolson… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org

Speech enhancement plays an essential role in a wide range of speech processing
applications. Recent studies on speech enhancement tend to investigate how to effectively …

Cited by 24 - Related articles - All 2 versions

Revisiting denoising diffusion probabilistic models for speech enhancement: Condition collapse, efficiency and refinement

W Tai, F Zhou, G Trajcevski, T Zhong - Proceedings of the AAAI …, 2023 - ojs.aaai.org

Recent literature has shown that denoising diffusion probabilistic models (DDPMs) can be
used to synthesize high-fidelity samples with a competitive (or sometimes better) quality than …

Cited by 11 - Related articles - All 3 versions
