Audio self-supervised learning: A survey

S Liu, A Mallol-Ragolta, E Parada-Cabaleiro, K Qian… - Patterns, 2022 - cell.com
Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) aims to discover general representations from large-scale data. This …

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Investigating self-supervised learning for speech enhancement and separation

Z Huang, S Watanabe, S Yang, P García… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Speech enhancement and separation are two fundamental tasks for robust speech
processing. Speech enhancement suppresses background noise while speech separation …

BioCPPNet: automatic bioacoustic source separation with deep neural networks

PC Bermant - Scientific Reports, 2021 - nature.com
We introduce the Bioacoustic Cocktail Party Problem Network (BioCPPNet), a
lightweight, modular, and robust U-Net-based machine learning architecture optimized for …

Heterogeneous target speech separation

E Tzinis, G Wichern, A Subramanian… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce a new paradigm for single-channel target source separation where the
sources of interest can be distinguished using non-mutually exclusive concepts (e.g., …

Efficient transformer-based speech enhancement using long frames and STFT magnitudes

D de Oliveira, T Peer, T Gerkmann - arXiv preprint arXiv:2206.11703, 2022 - arxiv.org
The SepFormer architecture shows very good results in speech separation. Like other
learned-encoder models, it uses short frames, as they have been shown to obtain better …

Don't speak too fast: The impact of data bias on self-supervised speech models

Y Meng, YH Chou, AT Liu, H Lee - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Self-supervised Speech Models (S3Ms) have been proven successful in many speech
downstream tasks, like ASR. However, how pretraining data affects S3Ms' downstream …

Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings

IC Chern, KH Hung, YT Chen, T Hussain… - … , Speech, and Signal …, 2023 - ieeexplore.ieee.org
AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective
for categorical problems such as automatic speech recognition and lip-reading. This …

Improving reverberant speech separation with synthetic room impulse responses

R Aralikatti, A Ratnarajah, Z Tang… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
We present a novel approach that improves the performance of reverberant speech
separation. Our approach is based on an accurate geometric acoustic simulator (GAS) …

Embedding recurrent layers with dual-path strategy in a variant of convolutional network for speaker-independent speech separation

X Yang, C Bao - arXiv preprint arXiv:2203.13574, 2022 - arxiv.org
Speaker-independent speech separation has achieved remarkable performance in recent
years with the development of deep neural networks (DNNs). Various network architectures …