Title | Authors | Venue | Cited by | Year |
Espresso: A fast end-to-end neural speech recognition toolkit | Y Wang, T Chen, H Xu, S Ding, H Lv, Y Shao, N Peng, L Xie, S Watanabe, ... | 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 88 | 2019 |
Speaker diarization with region proposal network | Z Huang, S Watanabe, Y Fujita, P García, Y Shao, D Povey, S Khudanpur | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 73 | 2020 |
PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR | Y Shao, Y Wang, D Povey, S Khudanpur | Proc. Interspeech 2020, 561-565, 2020 | 47 | 2020 |
Adversarial attacks and defenses for speech recognition systems | P Żelasko, S Joshi, Y Shao, J Villalba, J Trmal, N Dehak, S Khudanpur | arXiv preprint arXiv:2103.17122, 2021 | 28 | 2021 |
Using ASR methods for OCR | A Arora, CC Chang, B Rekabdar, B BabaAli, D Povey, D Etter, D Raj, ... | 2019 International Conference on Document Analysis and Recognition (ICDAR …, 2019 | 24 | 2019 |
Multi-channel multi-speaker ASR using 3D spatial feature | Y Shao, SX Zhang, D Yu | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 12 | 2022 |
Use of pitch continuity for robust speech activity detection | Y Shao, Q Lin | 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 11 | 2018 |
Defense against adversarial attacks on hybrid speech recognition using joint adversarial fine-tuning with denoiser | S Joshi, S Kataria, Y Shao, P Zelasko, J Villalba, S Khudanpur, N Dehak | arXiv preprint arXiv:2204.03851, 2022 | 10 | 2022 |
A Novel Normalization Method for Autocorrelation Function for Pitch Detection and for Speech Activity Detection | Q Lin, Y Shao | Interspeech, 2097-2101, 2018 | 7 | 2018 |
Unix-encoder: A universal x-channel speech encoder for ad-hoc microphone array speech processing | Z Huang, Y Shao, SX Zhang, D Yu | ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 2 | 2024 |
RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR | Y Shao, SX Zhang, D Yu | arXiv preprint arXiv:2311.00146, 2023 | 1 | 2023 |
Chunking Defense for Adversarial Attacks on ASR | Y Shao, J Villalba, S Joshi, S Kataria, S Khudanpur, N Dehak | Proc. Interspeech 2022, 2022 | 1 | 2022 |
Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment | Y Shao, SX Zhang, Y Xu, M Yu, D Yu, D Povey, S Khudanpur | arXiv preprint arXiv:2406.09589, 2024 | | 2024 |
Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset | Y Shao | arXiv preprint arXiv:2310.03901, 2023 | | 2023 |