Title | Authors | Venue | Citations | Year |
Deep clustering: Discriminative embeddings for segmentation and separation | JR Hershey, Z Chen, J Le Roux, S Watanabe | 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 1504 | 2016 |
WavLM: Large-scale self-supervised pre-training for full stack speech processing | S Chen, C Wang, Z Chen, Y Wu, S Liu, Z Chen, J Li, N Kanda, T Yoshioka, ... | IEEE Journal of Selected Topics in Signal Processing 16 (6), 1505-1518, 2022 | 1176 | 2022 |
Dual-path RNN: Efficient long sequence modeling for time-domain single-channel speech separation | Y Luo, Z Chen, T Yoshioka | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 730 | 2020 |
Deep attractor network for single-microphone speaker separation | Z Chen, Y Luo, N Mesgarani | arXiv preprint arXiv:1611.08930, 2016 | 497 | 2016 |
Single-channel multi-speaker separation using deep clustering | Y Isik, J Le Roux, Z Chen, S Watanabe, JR Hershey | arXiv preprint arXiv:1607.02173, 2016 | 478 | 2016 |
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | C Wang, S Chen, Y Wu, Z Zhang, L Zhou, S Liu, Z Chen, Y Liu, H Wang, ... | arXiv preprint arXiv:2301.02111, 2023 | 361 | 2023 |
Speaker-independent speech separation with deep attractor network | Y Luo, Z Chen, N Mesgarani | IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (4), 787-796, 2018 | 283 | 2018 |
Continuous speech separation: Dataset and analysis | Z Chen, T Yoshioka, L Lu, T Zhou, Z Meng, Y Luo, J Wu, X Xiao, J Li | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 205 | 2020 |
Deep clustering and conventional networks for music separation: Stronger together | Y Luo, Z Chen, JR Hershey, J Le Roux, N Mesgarani | 2017 IEEE international conference on acoustics, speech and signal …, 2017 | 205 | 2017 |
End-to-end attention based text-dependent speaker verification | SX Zhang, Z Chen, Y Zhao, J Li, Y Gong | 2016 IEEE Spoken Language Technology Workshop (SLT), 171-178, 2016 | 204 | 2016 |
Integration of speech enhancement and recognition using long-short term memory recurrent neural network | Z Chen, S Watanabe, H Erdogan, J Hershey | Proc. Interspeech, 1-7, 2015 | 180 | 2015 |
End-to-end microphone permutation and number invariant multi-channel speech separation | Y Luo, Z Chen, N Mesgarani, T Yoshioka | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 166 | 2020 |
Speaker-invariant training via adversarial learning | Z Meng, J Li, Z Chen, Y Zhao, V Mazalov, Y Gong, BH Juang | 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 143 | 2018 |
Multi-Channel Overlapped Speech Recognition with Location Guided Speech Extraction Network | Z Chen, X Xiao, T Yoshioka, H Erdogan, J Li, Y Gong | 2018 IEEE Spoken Language Technology Workshop (SLT), 558-565, 2018 | 135 | 2018 |
Multi-Microphone Neural Speech Separation for Far-Field Multi-Talker Speech Recognition | T Yoshioka, H Erdogan, Z Chen, F Alleva | ICASSP 2018, 2018 | 133 | 2018 |
BEATs: Audio Pre-Training with Acoustic Tokenizers | S Chen, Y Wu, C Wang, S Liu, D Tompkins, Z Chen, F Wei | arXiv preprint arXiv:2212.09058, 2022 | 131 | 2022 |
Continuous speech separation with conformer | S Chen, Y Wu, Z Chen, J Wu, J Li, T Yoshioka, C Wang, S Liu, M Zhou | ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 130 | 2021 |
Neural decoding of attentional selection in multi-speaker environments without access to clean sources | J O’Sullivan, Z Chen, J Herrero, GM McKhann, SA Sheth, AD Mehta, ... | Journal of Neural Engineering 14 (5), 056001, 2017 | 128 | 2017 |
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks | T Yoshioka, H Erdogan, Z Chen, X Xiao, F Alleva | Interspeech 2018, 2018 | 95 | 2018 |
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling | Z Zhang, L Zhou, C Wang, S Chen, Y Wu, S Liu, Z Chen, Y Liu, H Wang, ... | arXiv preprint arXiv:2303.03926, 2023 | 86 | 2023 |