Title | Authors | Venue | Cited by | Year
PaLM: Scaling language modeling with pathways | A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... | Journal of Machine Learning Research 24 (240), 1-113, 2023 | 3864 | 2023
Generating sentences from a continuous space | SR Bowman, L Vilnis, O Vinyals, AM Dai, R Jozefowicz, S Bengio | Proceedings of the 20th SIGNLL Conference on Computational Natural Language …, 2016 | 2735 | 2016
Natural questions: a benchmark for question answering research | T Kwiatkowski, J Palomaki, O Redfield, M Collins, A Parikh, C Alberti, ... | Transactions of the Association for Computational Linguistics 7, 453-466, 2019 | 2341 | 2019
Finetuned language models are zero-shot learners | J Wei, M Bosma, VY Zhao, K Guu, AW Yu, B Lester, N Du, AM Dai, QV Le | arXiv preprint arXiv:2109.01652, 2021 | 2324 | 2021
Scalable and accurate deep learning with electronic health records | A Rajkomar, E Oren, K Chen, AM Dai, N Hajaj, M Hardt, PJ Liu, X Liu, ... | npj Digital Medicine 1 (1), 1-10, 2018 | 2130 | 2018
Scaling instruction-finetuned language models | HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ... | Journal of Machine Learning Research 25 (70), 1-53, 2024 | 2005 | 2024
HyperNetworks | D Ha, A Dai, QV Le | Proceedings of the International Conference on Learning Representations, 2017 | 1587 | 2017
Semi-supervised sequence learning | AM Dai, QV Le | Advances in Neural Information Processing Systems 28, 2015 | 1582 | 2015
Adversarial training methods for semi-supervised text classification | T Miyato, AM Dai, I Goodfellow | Proceedings of the International Conference on Learning Representations, 2017 | 1269 | 2017
PaLM 2 technical report | R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... | arXiv preprint arXiv:2305.10403, 2023 | 995 | 2023
Gemini: a family of highly capable multimodal models | Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805, 2023 | 865 | 2023
Music Transformer | CZA Huang, A Vaswani, J Uszkoreit, N Shazeer, I Simon, C Hawthorne, ... | arXiv preprint arXiv:1809.04281, 2018 | 849 | 2018
Beyond the imitation game: quantifying and extrapolating the capabilities of language models | A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... | arXiv preprint arXiv:2206.04615, 2022 | 831 | 2022
MaskGAN: better text generation via filling in the ______ | W Fedus, I Goodfellow, AM Dai | arXiv preprint arXiv:1801.07736, 2018 | 610 | 2018
Document embedding with paragraph vectors | AM Dai, C Olah, QV Le | NIPS 2014 Deep Learning Workshop, 2015 | 562 | 2015
GLaM: efficient scaling of language models with mixture-of-experts | N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ... | International Conference on Machine Learning, 5547-5569, 2022 | 397 | 2022
Many paths to equilibrium: GANs do not need to decrease a divergence at every step | W Fedus, M Rosca, B Lakshminarayanan, AM Dai, S Mohamed, ... | arXiv preprint arXiv:1710.08446, 2017 | 251 | 2017
Gmail Smart Compose: real-time assisted writing | MX Chen, BN Lee, G Bansal, Y Cao, S Zhang, J Lu, J Tsay, Y Wang, ... | Proceedings of the 25th ACM SIGKDD International Conference on Knowledge …, 2019 | 226 | 2019
Who said what: modeling individual labelers improves classification | M Guan, V Gulshan, A Dai, G Hinton | Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 226 | 2018
Learning longer-term dependencies in RNNs with auxiliary losses | T Trinh, A Dai, T Luong, Q Le | International Conference on Machine Learning, 4965-4974, 2018 | 221 | 2018