Dynamic frequency feature selection based approach for classification of motor imageries J Luo, Z Feng, J Zhang, N Lu Computers in biology and medicine 75, 45-53, 2016 | 82 | 2016 |
A target guided subband filter for acoustic event detection in noisy environments using wavelet packets ZR Feng, Q Zhou, J Zhang, P Jiang, XW Yang IEEE/ACM transactions on audio, speech, and language processing 23 (2), 361-372, 2014 | 36 | 2014 |
Deep LSTM for large vocabulary continuous speech recognition X Tian, J Zhang, Z Ma, Y He, J Wei, P Wu, W Situ, S Li, Y Zhang arXiv preprint arXiv:1703.07090, 2017 | 27 | 2017 |
Improving RNN transducer with normalized jointer network M Huang, J Zhang, M Cai, Y Zhang, J Yao, Y You, Y He, Z Ma arXiv preprint arXiv:2011.01576, 2020 | 10 | 2020 |
Bring dialogue-context into RNN-T for streaming ASR J Hou, J Chen, W Li, Y Tang, J Zhang, Z Ma INTERSPEECH, 2048-2052, 2022 | 7 | 2022 |
Frame stacking and retaining for recurrent neural network acoustic model X Tian, J Zhang, Z Ma, Y He, J Wei arXiv preprint arXiv:1705.05992, 2017 | 6 | 2017 |
Language-specific acoustic boundary learning for Mandarin-English code-switching speech recognition Z Fan, L Dong, C Shen, Z Liang, J Zhang, L Lu, Z Ma arXiv preprint arXiv:2306.05279, 2023 | 5 | 2023 |
Seed-ASR: Understanding diverse speech and contexts with LLM-based speech recognition Y Bai, J Chen, J Chen, W Chen, Z Chen, C Ding, L Dong, Q Dong, Y Du, ... arXiv preprint arXiv:2407.04675, 2024 | 4 | 2024 |
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR Z Fan, L Dong, J Zhang, L Lu, Z Ma ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 4 | 2024 |
The volcspeech system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge C Shen, Y Liu, W Fan, B Wang, S Wen, Y Tian, J Zhang, J Yang, Z Ma ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 4 | 2022 |
Exponential moving average model in parallel speech recognition training X Tian, J Zhang, Z Ma, Y He, J Wei arXiv preprint arXiv:1703.01024, 2017 | 4 | 2017 |
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words J Ao, Y Wang, X Tian, D Chen, J Zhang, L Lu, Y Wang, H Li, Z Wu arXiv preprint arXiv:2406.13340, 2024 | 3 | 2024 |
Improving large-scale deep biasing with phoneme features and text-only data in streaming transducer J Qiu, L Huang, B Li, J Zhang, L Lu, Z Ma 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 3 | 2023 |
Token-level speaker change detection using speaker difference and speech content via continuous integrate-and-fire Z Fan, Z Liang, L Dong, Y Liu, S Zhou, M Cai, J Zhang, Z Ma, B Xu arXiv preprint arXiv:2211.09381, 2022 | 3 | 2022 |
HMM-Free Encoder Pre-Training for Streaming RNN Transducer L Huang, J Sun, Y Tang, J Hou, J Chen, J Zhang, Z Ma Proc. Interspeech 2021, 2021 | 3 | 2021 |
Dynamic latency speech recognition with asynchronous revision M Huang, M Cai, J Zhang, Y Zhang, Y You, Y He, Z Ma arXiv preprint arXiv:2011.01570, 2020 | 3 | 2020 |
Asynchronous motor imagery detection based on a target guided sub-band filter using wavelet packets Y Sun, Z Feng, J Zhang, Q Zhou, J Luo 2017 29th Chinese Control And Decision Conference (CCDC), 4850-4855, 2017 | 3 | 2017 |
Can Large Language Models Understand Spatial Audio? C Tang, W Yu, G Sun, X Chen, T Tan, W Li, J Zhang, L Lu, Z Ma, Y Wang, ... arXiv preprint arXiv:2406.07914, 2024 | 2 | 2024 |
CIF-PT: Bridging speech and text representations for spoken language understanding via continuous integrate-and-fire pre-training L Dong, Z An, P Wu, J Zhang, L Lu, Z Ma arXiv preprint arXiv:2305.17499, 2023 | 1 | 2023 |
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation S Wang, W Yu, Y Yang, C Tang, Y Li, J Zhuang, X Chen, X Tian, J Zhang, ... arXiv preprint arXiv:2409.16644, 2024 | | 2024 |