Zhaoheng Ni
Meta Reality Labs
Verified email at meta.com - Homepage
Title
Cited by
Year
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings
S Watanabe, M Mandel, J Barker, E Vincent, A Arora, X Chang, ...
arXiv preprint arXiv:2004.09249, 2020
Cited by: 298; Year: 2020
TorchAudio: Building Blocks for Audio and Speech Processing
YY Yang, M Hira, Z Ni, A Chourdia, A Astafurov, C Chen, CF Yeh, ...
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 164; Year: 2022
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
Journal of Machine Learning Research 25 (97), 1-52, 2024
Cited by: 133; Year: 2024
Confused or not confused? Disentangling brain activity from EEG data using bidirectional LSTM recurrent neural networks
Z Ni, AC Yuksel, X Ni, MI Mandel, L Xie
Proceedings of the 8th ACM International Conference on Bioinformatics …, 2017
Cited by: 79; Year: 2017
Towards low-distortion multi-channel speech enhancement: The ESPNET-SE submission to the L3DAS22 challenge
YJ Lu, S Cornell, X Chang, W Zhang, C Li, Z Ni, ZQ Wang, S Watanabe
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 24; Year: 2022
A time-frequency attention module for neural speech enhancement
Q Zhang, X Qian, Z Ni, A Nicolson, E Ambikairajah, H Li
IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 462-475, 2022
Cited by: 21; Year: 2022
Time-frequency attention for monaural speech enhancement
Q Zhang, Q Song, Z Ni, A Nicolson, H Li
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 21; Year: 2022
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
W Chen, X Chang, Y Peng, Z Ni, S Maiti, S Watanabe
Proc. INTERSPEECH 2023, 4404-4408, 2023
Cited by: 18; Year: 2023
ESPnet-SE++: Speech enhancement for robust speech recognition, translation, and understanding
YJ Lu, X Chang, C Li, W Zhang, S Cornell, Z Ni, Y Masuyama, B Yan, ...
Interspeech 2022, 5458-5462, 2022
Cited by: 18; Year: 2022
Torchaudio-squim: Reference-less speech quality and intelligibility measures in torchaudio
A Kumar, K Tan, Z Ni, P Manocha, X Zhang, E Henderson, B Xu
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 17; Year: 2023
Anatomical entity recognition with a hierarchical framework augmented by external resources
Y Xu, J Hua, Z Ni, Q Chen, Y Fan, S Ananiadou, EIC Chang, J Tsujii
PloS one 9 (10), e108396, 2014
Cited by: 13; Year: 2014
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
B Yan, J Shi, Y Tang, H Inaguma, Y Peng, S Dalmia, P Polák, ...
Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023
Cited by: 7; Year: 2023
WPD++: An improved neural beamformer for simultaneous speech separation and dereverberation
Z Ni, Y Xu, M Yu, B Wu, S Zhang, D Yu, MI Mandel
2021 IEEE Spoken Language Technology Workshop (SLT), 817-824, 2021
Cited by: 7; Year: 2021
Onssen: an open-source speech separation and enhancement library
Z Ni, MI Mandel
arXiv preprint arXiv:1911.00982, 2019
Cited by: 6; Year: 2019
Stack-and-delay: a new codebook pattern for music generation
G Le Lan, V Nagaraja, E Chang, D Kant, Z Ni, Y Shi, F Iandola, V Chandra
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 5; Year: 2024
Mask-dependent phase estimation for monaural speaker separation
Z Ni, MI Mandel
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 5; Year: 2020
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch
J Hwang, M Hira, C Chen, X Zhang, Z Ni, G Sun, P Ma, R Huang, V Pratap, ...
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-9, 2023
Cited by: 4; Year: 2023
Combining spatial clustering with LSTM speech models for multichannel speech enhancement
F Grezes, Z Ni, VA Trinh, M Mandel
arXiv preprint arXiv:2012.03388, 2020
Cited by: 4; Year: 2020
FoleyGen: Visually-Guided Audio Generation
X Mei, V Nagaraja, GL Lan, Z Ni, E Chang, Y Shi, V Chandra
arXiv preprint arXiv:2309.10537, 2023
Cited by: 3; Year: 2023
Ripple sparse self-attention for monaural speech enhancement
Q Zhang, H Zhu, Q Song, X Qian, Z Ni, H Li
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3; Year: 2023