An empirical study of Conv-TasNet

J Cosentino, M Pariente, S Cornell, A Deleforge… - arXiv preprint arXiv …, 2020 - arxiv.org

In recent years, wsj0-2mix has become the reference dataset for single-channel speech
separation. Most deep learning-based speech separation models today are benchmarked …

Cited by 306 - Related articles - All 5 versions

Cardiopulmonary auscultation enhancement with a two-stage noise cancellation approach

C Yang, N Dai, Z Wang, S Cai, J Wang, N Hu - … Signal Processing and …, 2023 - Elsevier

For cardiopulmonary auscultation using electronic stethoscopes, signal quality is a key
point. During signal acquisition various background sounds may be inevitably captured …

Cited by 10 - Related articles

On loss functions and evaluation metrics for music source separation

E Gusó, J Pons, S Pascual… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

We investigate which loss functions provide better separations via benchmarking an
extensive set of those for music source separation. To that end, we first survey the most …

Cited by 22 - Related articles - All 3 versions

LibriheavyMix: a 20,000-hour dataset for single-channel reverberant multi-talker speech separation, ASR and speaker diarization

Z Jin, Y Yang, M Shi, W Kang, X Yang, Z Yao… - arXiv preprint arXiv …, 2024 - arxiv.org

The evolving speech processing landscape is increasingly focused on complex scenarios
like meetings or cocktail parties with multiple simultaneous speakers and far-field conditions …

Cited by 2 - Related articles - All 3 versions

Gass: Generalizing audio source separation with large-scale data

J Pons, X Liu, S Pascual, J Serrà - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

Universal source separation targets at separating the audio sources of an arbitrary mix,
removing the constraint to operate on a specific domain like speech or music. Yet, the …

Cited by 10 - Related articles - All 4 versions

[PDF] Improved Speech Enhancement Using TCN with Multiple Encoder-Decoder Layers.

V Kishore, N Tiwari, P Paramasivam - Interspeech, 2020 - interspeech2020.org

A deep learning based time domain single-channel speech enhancement technique using
multilayer encoder-decoder and a temporal convolutional network is proposed for use in …

Cited by 29 - Related articles - All 5 versions

Att-TasNet: Attending to Encodings in Time-Domain Audio Speech Separation of Noisy, Reverberant Speech Mixtures

W Ravenscroft, S Goetze, T Hain - Frontiers in Signal Processing, 2022 - frontiersin.org

Separation of speech mixtures in noisy and reverberant environments remains a
challenging task for state-of-the-art speech separation systems. Time-domain audio speech …

Cited by 13 - Related articles - All 2 versions

Quantitative evidence on overlooked aspects of enrollment speaker embeddings for target speaker separation

X Liu, X Li, J Serrà - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org

Single channel target speaker separation (TSS) aims at extracting a speaker's voice from a
mixture of multiple talkers given an enrollment utterance of that speaker. A typical deep …

Cited by 11 - Related articles - All 4 versions

Speaker verification based on single channel speech separation

R Jin, M Ablimit, A Hamdulla - IEEE Access, 2023 - ieeexplore.ieee.org

In multi-speaker scenarios, speech processing tasks like speaker identification and speech
recognition are susceptible to noise and overlapped voices. As the overlapped voices are a …

Cited by 6 - Related articles - All 2 versions

PodcastMix: A dataset for separating music and speech in podcasts

N Schmidt, J Pons, M Miron - arXiv preprint arXiv:2207.07403, 2022 - arxiv.org

We introduce PodcastMix, a dataset formalizing the task of separating background music
and foreground speech in podcasts. We aim at defining a benchmark suitable for training …

Cited by 4 - Related articles - All 4 versions
