RoBERTa: A robustly optimized BERT pretraining approach Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 25387* | 2019 |
Deep contextualized word representations ME Peters, M Neumann, M Iyyer, M Gardner, C Clark, K Lee, ... NAACL, 2018 | 15364* | 2018 |
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension M Lewis, Y Liu, N Goyal, M Ghazvininejad, A Mohamed, O Levy, ... arXiv preprint arXiv:1910.13461, 2019 | 9637 | 2019 |
Unsupervised cross-lingual representation learning at scale A Conneau, K Khandelwal, N Goyal, V Chaudhary, G Wenzek, F Guzmán, ... arXiv preprint arXiv:1911.02116, 2019 | 5563 | 2019 |
OPT: Open pre-trained transformer language models S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ... arXiv preprint arXiv:2205.01068, 2022 | 2515* | 2022 |
SpanBERT: Improving pre-training by representing and predicting spans M Joshi, D Chen, Y Liu, DS Weld, L Zettlemoyer, O Levy Transactions of the Association for Computational Linguistics 8, 64-77, 2020 | 2053 | 2020 |
TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension M Joshi, E Choi, DS Weld, L Zettlemoyer arXiv preprint arXiv:1705.03551, 2017 | 1884 | 2017 |
Multilingual denoising pre-training for neural machine translation Y Liu, J Gu, N Goyal, X Li, S Edunov, M Ghazvininejad, M Lewis, ... Transactions of the Association for Computational Linguistics 8, 726-742, 2020 | 1595 | 2020 |
AllenNLP: A deep semantic natural language processing platform M Gardner, J Grus, M Neumann, O Tafjord, P Dasigi, N Liu, M Peters, ... arXiv preprint arXiv:1803.07640, 2018 | 1378 | 2018 |
Knowledge-based weak supervision for information extraction of overlapping relations R Hoffmann, C Zhang, X Ling, L Zettlemoyer, DS Weld Proceedings of the 49th Annual Meeting of the Association for Computational …, 2011 | 1215 | 2011 |
Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars LS Zettlemoyer, M Collins Conference on Uncertainty in Artificial Intelligence (UAI), 2005 | 1134* | 2005 |
End-to-end neural coreference resolution K Lee, L He, M Lewis, L Zettlemoyer arXiv preprint arXiv:1707.07045, 2017 | 1097 | 2017 |
QLoRA: Efficient finetuning of quantized LLMs T Dettmers, A Pagnoni, A Holtzman, L Zettlemoyer Advances in Neural Information Processing Systems 36, 2024 | 913 | 2024 |
Toolformer: Language models can teach themselves to use tools T Schick, J Dwivedi-Yu, R Dessì, R Raileanu, M Lomeli, E Hambro, ... Advances in Neural Information Processing Systems 36, 2024 | 891 | 2024 |
Rethinking the role of demonstrations: What makes in-context learning work? S Min, X Lyu, A Holtzman, M Artetxe, M Lewis, H Hajishirzi, L Zettlemoyer arXiv preprint arXiv:2202.12837, 2022 | 874 | 2022 |
QuAC: Question answering in context E Choi, H He, M Iyyer, M Yatskar, W Yih, Y Choi, P Liang, L Zettlemoyer arXiv preprint arXiv:1808.07036, 2018 | 872 | 2018 |
Summarizing source code using a neural attention model S Iyer, I Konstas, A Cheung, L Zettlemoyer 54th Annual Meeting of the Association for Computational Linguistics 2016 …, 2016 | 806 | 2016 |
Adversarial example generation with syntactically controlled paraphrase networks M Iyyer, J Wieting, K Gimpel, L Zettlemoyer arXiv preprint arXiv:1804.06059, 2018 | 730 | 2018 |
Generalization through memorization: Nearest neighbor language models U Khandelwal, O Levy, D Jurafsky, L Zettlemoyer, M Lewis arXiv preprint arXiv:1911.00172, 2019 | 651 | 2019 |
ALFRED: A benchmark for interpreting grounded instructions for everyday tasks M Shridhar, J Thomason, D Gordon, Y Bisk, W Han, R Mottaghi, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 631 | 2020 |