AST: Audio Spectrogram Transformer Y Gong, YA Chung, J Glass Interspeech 2021, 2021 | 763 | 2021 |
Second-order non-local attention networks for person re-identification BN Xia, Y Gong, Y Zhang, C Poellabauer ICCV 2019, 3760-3769, 2019 | 236 | 2019 |
SSAST: Self-Supervised Audio Spectrogram Transformer Y Gong, CIJ Lai, YA Chung, J Glass AAAI 2022, 2022 | 232 | 2022 |
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation Y Gong, YA Chung, J Glass IEEE Transactions on Audio, Speech, and Language Processing, 2021 | 151 | 2021 |
Topic modeling based multi-modal depression detection Y Gong, C Poellabauer Proceedings of the 7th annual workshop on Audio/Visual emotion challenge, 69-76, 2017 | 140 | 2017 |
Crafting adversarial examples for speech paralinguistics applications Y Gong, C Poellabauer Proceedings of 2018 DYnamic and Novel Advances in Machine Learning and …, 2017 | 125 | 2017 |
Contrastive Audio-Visual Masked Autoencoder Y Gong, A Rouditchenko, AH Liu, D Harwath, L Karlinsky, H Kuehne, ... ICLR 2023, 2022 | 85 | 2022 |
Real-time Adversarial Attacks Y Gong, B Li, C Poellabauer, Y Shi IJCAI 2019, 2019 | 62 | 2019 |
Listen, Think, and Understand Y Gong, H Luo, AH Liu, L Karlinsky, J Glass ICLR 2024, 2023 | 59 | 2023 |
Transformer-based multi-aspect multi-granularity non-native English speaker pronunciation assessment Y Gong, Z Chen, IH Chu, P Chang, J Glass ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 46 | 2022 |
ReMASC: realistic replay attack corpus for voice controlled systems Y Gong, J Yang, J Huber, M MacKnight, C Poellabauer Interspeech 2019, 2019 | 42 | 2019 |
An overview of vulnerabilities of voice controlled systems Y Gong, C Poellabauer 1st International Workshop on Security and Privacy for the Internet-of …, 2018 | 39 | 2018 |
Protecting voice controlled systems using sound source identification based on acoustic cues Y Gong, C Poellabauer 2018 27th International Conference on Computer Communication and Networks …, 2018 | 38 | 2018 |
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers Y Gong, S Khurana, L Karlinsky, J Glass Interspeech 2023, 2023 | 36 | 2023 |
Detecting replay attacks using multi-channel audio: A neural network-based method Y Gong, J Yang, C Poellabauer IEEE Signal Processing Letters 27, 920-924, 2020 | 31 | 2020 |
CMKD: CNN/Transformer-based cross-model knowledge distillation for audio classification Y Gong, S Khurana, A Rouditchenko, J Glass arXiv preprint arXiv:2203.06760, 2022 | 30 | 2022 |
VocalSound: A dataset for improving human vocal sounds recognition Y Gong, J Yu, J Glass ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 23 | 2022 |
Impact of Aliasing on Deep CNN-Based End-to-End Acoustic Models Y Gong, C Poellabauer Interspeech 2018, 2698-2702, 2018 | 23 | 2018 |
Search augmented instruction learning H Luo, T Zhang, YS Chuang, Y Gong, Y Kim, X Wu, H Meng, J Glass Findings of the Association for Computational Linguistics: EMNLP 2023, 3717-3729, 2023 | 21 | 2023 |
Automatic autism spectrum disorder detection using everyday vocalizations captured by smart devices Y Gong, H Yatawatte, C Poellabauer, S Schneider, S Latham Proceedings of the 2018 ACM international conference on bioinformatics …, 2018 | 17 | 2018 |