A study of learning based beamforming methods for speech recognition

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org

Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

Cited by 427 Related articles All 10 versions

[PDF] mlr.press

Voice separation with an unknown number of multiple speakers

E Nachmani, Y Adi, L Wolf - International Conference on …, 2020 - proceedings.mlr.press

We present a new method for separating a mixed audio sequence, in which multiple voices
speak simultaneously. The new method employs gated neural networks that are trained to …

Cited by 196 Related articles All 6 versions

[PDF] ieee.org

Spex: Multi-scale time domain speaker extraction network

C Xu, W Rao, ES Chng, H Li - IEEE/ACM transactions on audio …, 2020 - ieeexplore.ieee.org

Speaker extraction aims to mimic humans' selective auditory attention by extracting a target
speaker's voice from a multi-talker environment. It is common to perform the extraction in …

Cited by 193 Related articles All 6 versions

[PDF] arxiv.org

End-to-end microphone permutation and number invariant multi-channel speech separation

Y Luo, Z Chen, N Mesgarani… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

An important problem in ad-hoc microphone speech separation is how to guarantee the
robustness of a system with respect to the locations and numbers of microphones. The …

Cited by 194 Related articles All 3 versions

[PDF] arxiv.org

ADL-MVDR: All deep learning MVDR beamformer for target speech separation

Z Zhang, Y Xu, M Yu, SX Zhang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Speech separation algorithms are often used to separate the target speech from other
interfering sources. However, purely neural network based speech separation systems often …

Cited by 146 Related articles All 7 versions

[PDF] arxiv.org

FaSNet: Low-latency adaptive beamforming for multi-microphone audio processing

Y Luo, C Han, N Mesgarani, E Ceolini… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org

Beamforming has been extensively investigated for multi-channel audio processing tasks.
Recently, learning-based beamforming methods, sometimes called neural beamformers …

Cited by 166 Related articles All 6 versions

[PDF] merl.com

Unified architecture for multichannel end-to-end speech recognition with neural beamforming

T Ochiai, S Watanabe, T Hori… - IEEE Journal of …, 2017 - ieeexplore.ieee.org

This paper proposes a unified architecture for end-to-end automatic speech recognition
(ASR) to encompass microphone-array signal processing such as a state-of-the-art neural …

Cited by 106 Related articles All 7 versions

[PDF] mlr.press

Multichannel end-to-end speech recognition

T Ochiai, S Watanabe, T Hori… - … conference on machine …, 2017 - proceedings.mlr.press

The field of speech recognition is in the midst of a paradigm shift: end-to-end neural
networks are challenging the dominance of hidden Markov models as a core technology …

Cited by 126 Related articles All 14 versions

[PDF] uni-paderborn.de

End-to-end dereverberation, beamforming, and speech recognition in a cocktail party

W Zhang, X Chang, C Boeddeker… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org

Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …

Cited by 19 Related articles All 5 versions

[PDF] arxiv.org

Time-domain speaker extraction network

C Xu, W Rao, ES Chng, H Li - 2019 IEEE Automatic Speech …, 2019 - ieeexplore.ieee.org

Speaker extraction is to extract a target speaker's voice from multi-talker speech. It simulates
humans' cocktail party effect or the selective listening ability. The prior work mostly performs …

Cited by 64 Related articles All 7 versions
