Adaptive Neural Speech Enhancement with a Denoising Variational Autoencoder.

[HTML] An overview of variational autoencoders for source separation, finance, and bio-signal applications

A Singh, T Ogunfunmi - Entropy, 2021 - mdpi.com

Autoencoders are a self-supervised learning system where, during training, the output is an
approximation of the input. Typically, autoencoders have three parts: Encoder (which …

Cited by 53 - Related articles - All 7 versions

[PDF] arxiv.org

Speech enhancement and dereverberation with diffusion-based generative models

J Richter, S Welker, JM Lemercier… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …

Cited by 110 - Related articles - All 4 versions

[PDF] arxiv.org

Speech enhancement with score-based generative models in the complex STFT domain

S Welker, J Richter, T Gerkmann - arXiv preprint arXiv:2203.17004, 2022 - arxiv.org

Score-based generative models (SGMs) have recently shown impressive results for difficult
generative tasks such as the unconditional and conditional generation of natural images …

Cited by 78 - Related articles - All 7 versions

[PDF] arxiv.org

Unsupervised speech enhancement using dynamical variational autoencoders

X Bie, S Leglaive, X Alameda-Pineda… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org

Dynamical variational autoencoders (DVAEs) are a class of deep generative models with
latent variables, dedicated to model time series of high-dimensional data. DVAEs can be …

Cited by 48 - Related articles - All 17 versions

[PDF] arxiv.org

RF-Diffusion: Radio Signal Generation via Time-Frequency Diffusion

G Chi, Z Yang, C Wu, J Xu, Y Gao, Y Liu… - Proceedings of the 30th …, 2024 - dl.acm.org

Along with AIGC shines in CV and NLP, its potential in the wireless domain has also
emerged in recent years. Yet, existing RF-oriented generative solutions are ill-suited for …

Cited by 2 - Related articles - All 3 versions

[PDF] thecvf.com

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

S Chowdhury, S Nag, KJ Joseph… - Proceedings of the …, 2024 - openaccess.thecvf.com

Music is a universal language that can communicate emotions and feelings. It forms an
essential part of the whole spectrum of creative media ranging from movies to social media …

Cited by 1 - Related articles - All 3 versions

[HTML] aip.org

[HTML] Deep learning-based stereophonic acoustic echo suppression without decorrelation

L Cheng, R Peng, A Li, C Zheng, X Li - The Journal of the Acoustical …, 2021 - pubs.aip.org

Traditional stereophonic acoustic echo cancellation algorithms need to estimate acoustic
echo paths from stereo loudspeakers to a microphone, which often suffers from the …

Cited by 15 - Related articles - All 5 versions

[PDF] arxiv.org

A multi-dimensional deep structured state space approach to speech enhancement using small-footprint models

PJ Ku, CHH Yang, SM Siniscalchi, CH Lee - arXiv preprint arXiv …, 2023 - arxiv.org

We propose a multi-dimensional structured state space (S4) approach to speech
enhancement. To better capture the spectral dependencies across the frequency axis, we …

Cited by 6 - Related articles - All 6 versions

[PDF] arxiv.org

Disentanglement learning for variational autoencoders applied to audio-visual speech enhancement

G Carbajal, J Richter… - 2021 IEEE Workshop on …, 2021 - ieeexplore.ieee.org

Recently, the standard variational autoencoder has been successfully used to learn a
probabilistic prior over speech signals, which is then used to perform speech enhancement …

Cited by 16 - Related articles - All 5 versions

[PDF] arxiv.org

Variance-preserving-Based interpolation diffusion models for speech enhancement

Z Guo, J Du, CH Lee, Y Gao, W Zhang - arXiv preprint arXiv:2306.08527, 2023 - arxiv.org

The goal of this study is to implement diffusion models for speech enhancement (SE). The
first step is to emphasize the theoretical foundation of variance-preserving (VP)-based …

Cited by 5 - Related articles - All 5 versions
