Scaling instruction-finetuned language models. HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, et al. Journal of Machine Learning Research 25 (70), 1-53, 2024. Cited by 2144.
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models. A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, et al. TMLR 2023, 2022. Cited by 847.
Holistic evaluation of language models. P Liang, R Bommasani, T Lee, D Tsipras, D Soylu, M Yasunaga, Y Zhang, et al. TMLR 2023, 2022. Cited by 757.
Challenging BIG-Bench tasks and whether chain-of-thought can solve them. M Suzgun, N Scales, N Schärli, S Gehrmann, Y Tay, HW Chung, et al. ACL 2023 (Findings), 2022. Cited by 407*.
Language models are multilingual chain-of-thought reasoners. F Shi, M Suzgun, M Freitag, X Wang, S Srivats, S Vosoughi, HW Chung, et al. ICLR 2023, 2022. Cited by 164.
LSTM Networks Can Perform Dynamic Counting. M Suzgun, S Gehrmann, Y Belinkov, SM Shieber. ACL 2019 Workshop on Deep Learning and Formal Languages, 2019. Cited by 72.
Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study. T Zack, E Lehman, M Suzgun, JA Rodriguez, LA Celi, J Gichoya, et al. The Lancet Digital Health 6 (1), e12-e22, 2024. Cited by 70.
Do Language Models Know When They're Hallucinating References? A Agrawal, M Suzgun, L Mackey, AT Kalai. EACL 2024, 2023. Cited by 59.
Prompt-and-rerank: A method for zero-shot and few-shot arbitrary textual style transfer with small language models. M Suzgun, L Melas-Kyriazi, D Jurafsky. EMNLP 2022, 2022. Cited by 46.
Safety-Tuned LLaMAs: Lessons from improving the safety of large language models that follow instructions. F Bianchi, M Suzgun, G Attanasio, P Röttger, D Jurafsky, T Hashimoto, et al. ICLR 2024, 2023. Cited by 45.
On Evaluating the Generalization of LSTM Models in Formal Languages. M Suzgun, Y Belinkov, SM Shieber. SCiL 2019, 2018. Cited by 42.
Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages. M Suzgun, S Gehrmann, Y Belinkov, SM Shieber. arXiv preprint arXiv:1911.03329, 2019. Cited by 39.
When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization. F Ladhak, E Durmus, M Suzgun, T Zhang, D Jurafsky, K McKeown, et al. EACL 2023, 2023. Cited by 30.
HospiSign: An Interactive Sign Language Platform for Hearing Impaired. M Suzgun, H Ozdemir, N Camgoz, A Kindiroglu, D Basaran, C Togay, et al. Journal of Naval Sciences and Engineering 11 (3), 75-92, 2015. Cited by 30.
Follow the wisdom of the crowd: Effective text generation via minimum Bayes risk decoding. M Suzgun, L Melas-Kyriazi, D Jurafsky. ACL 2023, 2022. Cited by 23.
Large legal fictions: Profiling legal hallucinations in large language models. M Dahl, V Magesh, M Suzgun, DE Ho. Journal of Legal Analysis 16 (1), 64-93, 2024. Cited by 21.
Meta-prompting: Enhancing language models with task-agnostic scaffolding. M Suzgun, AT Kalai. arXiv preprint arXiv:2401.12954, 2024. Cited by 15.
A Benchmark for Learning to Translate a New Language from One Grammar Book. G Tanzer, M Suzgun, E Visser, D Jurafsky, L Melas-Kyriazi. arXiv preprint arXiv:2309.16575, 2023. Cited by 14.
Coding Inequity: Assessing GPT-4's Potential for Perpetuating Racial and Gender Biases in Healthcare. T Zack, E Lehman, M Suzgun, JA Rodriguez, LA Celi, J Gichoya, et al. medRxiv 2023.07.13.23292577, 2023. Cited by 14.
The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications. M Suzgun, L Melas-Kyriazi, SK Sarkar, SD Kominers, SM Shieber. NeurIPS 2023 (Datasets and Benchmarks Track), 2022. Cited by 14.