GPipe: Efficient training of giant neural networks using pipeline parallelism. Y Huang, Y Cheng, A Bapna, O Firat, D Chen, M Chen, HJ Lee, J Ngiam, et al. Advances in Neural Information Processing Systems 32, 2019. Cited by 1523.
Gemini: A family of highly capable multimodal models. Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, et al. arXiv preprint arXiv:2312.11805, 2023. Cited by 843.
The best of both worlds: Combining recent advances in neural machine translation. MX Chen, O Firat, A Bapna, M Johnson, W Macherey, G Foster, L Jones, et al. arXiv preprint arXiv:1804.09849, 2018. Cited by 514.
Simple, scalable adaptation for neural machine translation. A Bapna, N Arivazhagan, O Firat. arXiv preprint arXiv:1909.08478, 2019. Cited by 409.
Massively multilingual neural machine translation in the wild: Findings and challenges. N Arivazhagan, A Bapna, O Firat, D Lepikhin, M Johnson, M Krikun, et al. arXiv preprint arXiv:1907.05019, 2019. Cited by 387.
Building a conversational agent overnight with dialogue self-play. P Shah, D Hakkani-Tür, G Tür, A Rastogi, A Bapna, N Nayak, L Heck. arXiv preprint arXiv:1801.04871, 2018. Cited by 220.
Lingvo: A modular and scalable framework for sequence-to-sequence modeling. J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, et al. arXiv preprint arXiv:1902.08295, 2019. Cited by 202.
Large-scale multilingual speech recognition with a streaming end-to-end model. A Kannan, A Datta, TN Sainath, E Weinstein, B Ramabhadran, Y Wu, et al. arXiv preprint arXiv:1909.05330, 2019. Cited by 175.
Google USM: Scaling automatic speech recognition beyond 100 languages. Y Zhang, W Han, J Qin, Y Wang, A Bapna, Z Chen, N Chen, B Li, et al. arXiv preprint arXiv:2303.01037, 2023. Cited by 151.
Towards zero-shot frame semantic parsing for domain scaling. A Bapna, G Tur, D Hakkani-Tur, L Heck. arXiv preprint arXiv:1707.02363, 2017. Cited by 144.
FLEURS: Few-shot learning evaluation of universal representations of speech. A Conneau, M Ma, S Khanuja, Y Zhang, V Axelrod, S Dalmia, J Riesa, et al. 2022 IEEE Spoken Language Technology Workshop (SLT), 798-805, 2023. Cited by 132.
Training deeper neural machine translation models with transparent attention. A Bapna, MX Chen, O Firat, Y Cao, Y Wu. arXiv preprint arXiv:1808.07561, 2018. Cited by 123.
Revisiting character-based neural machine translation with capacity and compression. C Cherry, G Foster, A Bapna, O Firat, W Macherey. arXiv preprint arXiv:1808.09943, 2018. Cited by 121.
Investigating multilingual NMT representations at scale. SR Kudugunta, A Bapna, I Caswell, N Arivazhagan, O Firat. arXiv preprint arXiv:1909.02197, 2019. Cited by 114.
The missing ingredient in zero-shot neural machine translation. N Arivazhagan, A Bapna, O Firat, R Aharoni, M Johnson, W Macherey. arXiv preprint arXiv:1903.07091, 2019. Cited by 105.
Quality at a glance: An audit of web-crawled multilingual datasets. J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, et al. Transactions of the Association for Computational Linguistics 10, 50-72, 2022. Cited by 98.
mSLAM: Massively multilingual joint pre-training for speech and text. A Bapna, C Cherry, Y Zhang, Y Jia, M Johnson, Y Cheng, S Khanuja, et al. arXiv preprint arXiv:2202.01374, 2022. Cited by 93.
AudioPaLM: A large language model that can speak and listen. PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, et al. arXiv preprint arXiv:2306.12925, 2023. Cited by 88.
Maestro: Matched speech text representations through modality matching. Z Chen, Y Zhang, A Rosenberg, B Ramabhadran, P Moreno, A Bapna, et al. arXiv preprint arXiv:2204.03409, 2022. Cited by 83.
Share or not? Learning to schedule language-specific capacity for multilingual translation. B Zhang, A Bapna, R Sennrich, O Firat. Ninth International Conference on Learning Representations (ICLR), 2021. Cited by 83.