Training speech enhancement systems with noisy speech datasets

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer

Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Cited by 26 · Related articles · All 8 versions

[PDF] ieee.org

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

E Tzinis, Y Adi, VK Ithapu, B Xu… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

We present RemixIT, a simple yet effective self-supervised method for training speech
enhancement without the need of a single isolated in-domain speech nor a noise waveform …

Cited by 56 · Related articles · All 5 versions

[PDF] arxiv.org

Unsupervised speech enhancement with speech recognition embedding and disentanglement losses

VA Trinh, S Braun - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org

Speech enhancement has recently achieved great success with various deep learning
methods. However, most conventional speech enhancement systems are trained with …

Cited by 24 · Related articles · All 4 versions

[PDF] arxiv.org

Diffusion-based speech enhancement with joint generative and predictive decoders

H Shi, K Shimada, M Hirano, T Shibuya… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Diffusion-based generative speech enhancement (SE) has recently received attention, but
reverse diffusion remains time-consuming. One solution is to initialize the reverse diffusion …

Cited by 15 · Related articles · All 6 versions

[PDF] arxiv.org

Continual self-training with bootstrapped remixing for speech enhancement

E Tzinis, Y Adi, VK Ithapu, B Xu… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

We propose RemixIT, a simple and novel self-supervised training method for speech
enhancement. The proposed method is based on a continuously self-training scheme that …

Cited by 19 · Related articles · All 4 versions

[PDF] arxiv.org

Efficient personalized speech enhancement through self-supervised learning

A Sivaraman, M Kim - IEEE Journal of Selected Topics in Signal …, 2022 - ieeexplore.ieee.org

This work presents self-supervised learning methods for monaural speaker-specific (i.e.,
personalized) speech enhancement models. While general-purpose models must broadly …

Cited by 23 · Related articles · All 5 versions

[PDF] ieee.org

Large-scale unsupervised audio pre-training for video-to-speech synthesis

T Kefalas, Y Panagakis, M Pantic - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

Video-to-speech synthesis is the task of reconstructing the speech signal from a silent video
of a speaker. Previous approaches train on data from almost exclusively audio-visual …

Cited by 1 · Related articles · All 5 versions

Power Normalized Gammachirp Cepstral (PNGC) coefficients-based approach for robust speaker recognition

Y Zouhir, M Zarka, K Ouni - Applied Acoustics, 2023 - Elsevier

Speaker identification or recognition task aims to identify persons from their voices. This
paper introduces a new feature extraction approach for robust speaker recognition named …

Cited by 4 · Related articles

[PDF] arxiv.org

Self-supervised speech denoising using only noisy audio signals

J Wu, Q Li, G Yang, L Li, L Senhadji, H Shu - Speech Communication, 2023 - Elsevier

In traditional speech denoising tasks, clean audio signals are often used as the training
target, but absolutely clean signals are collected from expensive recording equipment or in …

Cited by 14 · Related articles · All 7 versions

[HTML] mdpi.com

[HTML] Using Deep Learning to Classify Environmental Sounds in the Habitat of Western Black-Crested Gibbons

R Hu, K Hu, L Wang, Z Guan, X Zhou, N Wang, L Ye - Diversity, 2024 - mdpi.com

The western black-crested gibbon (Nomascus concolor) is a rare and endangered primate
that inhabits southern China and northern Vietnam, and has become a key conservation …
