| Title | Authors | Venue | Citations | Year |
| --- | --- | --- | --- | --- |
| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | | 1280 | 2023 |
| Using the Output Embedding to Improve Language Models | O Press, L Wolf | EACL 2017 | 752 | 2017 |
| Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation | O Press, NA Smith, M Lewis | ICLR 2022 | 407 | 2021 |
| Measuring and Narrowing the Compositionality Gap in Language Models | O Press, M Zhang, S Min, L Schmidt, NA Smith, M Lewis | Findings of EMNLP 2023 | 277* | 2022 |
| How Language Model Hallucinations Can Snowball | M Zhang, O Press, W Merrill, A Liu, NA Smith | arXiv preprint arXiv:2305.13534 | 153 | 2023 |
| Language Generation with Recurrent Generative Adversarial Networks without Pre-training | O Press, A Bar, B Bogin, J Berant, L Wolf | 1st Workshop on Learning to Generate Natural Language at ICML 2017 | 137 | 2017 |
| What Language Model to Train if You Have One Million GPU Hours? | T Le Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ... | Findings of EMNLP 2022 | 80 | 2022 |
| SWE-bench: Can Language Models Resolve Real-World GitHub Issues? | CE Jimenez, J Yang, A Wettig, S Yao, K Pei, O Press, K Narasimhan | ICLR 2024 | 77 | 2023 |
| Improving Transformer Models by Reordering their Sublayers | O Press, NA Smith, O Levy | ACL 2020 | 73 | 2019 |
| Shortformer: Better Language Modeling using Shorter Inputs | O Press, NA Smith, M Lewis | ACL 2021 | 69 | 2020 |
| Transformer Language Models without Positional Encodings Still Learn Positional Information | A Haviv, O Ram, O Press, P Izsak, O Levy | Findings of EMNLP 2022 | 62 | 2022 |
| You May Not Need Attention | O Press, NA Smith | arXiv preprint arXiv:1810.13409 | 28 | 2018 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | J Yang, CE Jimenez, A Wettig, K Lieret, S Yao, K Narasimhan, O Press | | 5 | 2024 |
| Partially Shuffling the Training Data to Improve Language Models | O Press | arXiv preprint arXiv:1903.04167 | 4 | 2019 |
| Complementing Scale: Novel Guidance Methods for Improving Language Models | O Press | University of Washington | | 2023 |