| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| wav2vec 2.0: A framework for self-supervised learning of speech representations | A Baevski, H Zhou, A Mohamed, M Auli | arXiv preprint arXiv:2006.11477 | 4749 | 2020 |
| fairseq: A fast, extensible toolkit for sequence modeling | M Ott, S Edunov, A Baevski, A Fan, S Gross, N Ng, D Grangier, M Auli | arXiv preprint arXiv:1904.01038 | 3014 | 2019 |
| wav2vec: Unsupervised pre-training for speech recognition | S Schneider, A Baevski, R Collobert, M Auli | arXiv preprint arXiv:1904.05862 | 1451 | 2019 |
| data2vec: A general framework for self-supervised learning in speech, vision and language | A Baevski, WN Hsu, Q Xu, A Babu, J Gu, M Auli | International Conference on Machine Learning, 1298-1312 | 716 | 2022 |
| Unsupervised cross-lingual representation learning for speech recognition | A Conneau, A Baevski, R Collobert, A Mohamed, M Auli | arXiv preprint arXiv:2006.13979 | 707 | 2020 |
| vq-wav2vec: Self-supervised learning of discrete speech representations | A Baevski, S Schneider, M Auli | arXiv preprint arXiv:1910.05453 | 671 | 2019 |
| Pay less attention with lightweight and dynamic convolutions | F Wu, A Fan, A Baevski, YN Dauphin, M Auli | arXiv preprint arXiv:1901.10430 | 653 | 2019 |
| XLS-R: Self-supervised cross-lingual speech representation learning at scale | A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu, N Goyal, K Singh, et al. | arXiv preprint arXiv:2111.09296 | 525 | 2021 |
| Adaptive input representations for neural language modeling | A Baevski, M Auli | arXiv preprint arXiv:1809.10853 | 417 | 2018 |
| Facebook FAIR's WMT19 news translation task submission | N Ng, K Yee, A Baevski, M Ott, M Auli, S Edunov | arXiv preprint arXiv:1907.06616 | 408 | 2019 |
| Unsupervised speech recognition | A Baevski, WN Hsu, A Conneau, M Auli | Advances in Neural Information Processing Systems 34, 27826-27839 | 276 | 2021 |
| On generative spoken language modeling from raw audio | K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, et al. | Transactions of the Association for Computational Linguistics 9, 1336-1354 | 267 | 2021 |
| Cloze-driven pretraining of self-attention networks | A Baevski, S Edunov, Y Liu, L Zettlemoyer, M Auli | arXiv preprint arXiv:1903.07785 | 261 | 2019 |
| Effectiveness of self-supervised pre-training for speech recognition | A Baevski, M Auli, A Mohamed | arXiv preprint arXiv:1911.03912 | 241* | 2019 |
| Robust wav2vec 2.0: Analyzing domain shift in self-supervised pre-training | WN Hsu, A Sriram, A Baevski, T Likhomanenko, Q Xu, V Pratap, J Kahn, et al. | arXiv preprint arXiv:2104.01027 | 225 | 2021 |
| Self-training and pre-training are complementary for speech recognition | Q Xu, A Baevski, T Likhomanenko, P Tomasello, A Conneau, R Collobert, et al. | ICASSP 2021 - IEEE International Conference on Acoustics, Speech and Signal Processing | 168 | 2021 |
| Masked autoencoders that listen | PY Huang, H Xu, J Li, A Baevski, M Auli, W Galuba, F Metze, et al. | Advances in Neural Information Processing Systems 35, 28708-28720 | 164 | 2022 |
| Pre-trained language model representations for language generation | S Edunov, A Baevski, M Auli | arXiv preprint arXiv:1903.09722 | 160 | 2019 |
| Scaling speech technology to 1,000+ languages | V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, et al. | Journal of Machine Learning Research 25 (97), 1-52 | 158 | 2024 |
| Multilingual speech translation with efficient finetuning of pretrained models | X Li, C Wang, Y Tang, C Tran, Y Tang, J Pino, A Baevski, A Conneau, et al. | arXiv preprint arXiv:2010.12829 | 132 | 2020 |