A broad-coverage challenge corpus for sentence understanding through inference A Williams, N Nangia, SR Bowman arXiv preprint arXiv:1704.05426, 2017 | 4279 | 2017 |
SuperGLUE: A stickier benchmark for general-purpose language understanding systems A Wang, Y Pruksachatkun, N Nangia, A Singh, J Michael, F Hill, O Levy, ... Advances in neural information processing systems 32, 2019 | 2029 | 2019 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 828 | 2022 |
CrowS-pairs: A challenge dataset for measuring social biases in masked language models N Nangia, C Vania, R Bhalerao, SR Bowman arXiv preprint arXiv:2010.00133, 2020 | 465 | 2020 |
BBQ: A hand-built bias benchmark for question answering A Parrish, A Chen, N Nangia, V Padmakumar, J Phang, J Thompson, ... arXiv preprint arXiv:2110.08193, 2021 | 156 | 2021 |
ListOps: A diagnostic dataset for latent tree learning N Nangia, SR Bowman arXiv preprint arXiv:1804.06028, 2018 | 123 | 2018 |
The RepEval 2017 shared task: Multi-genre natural language inference with sentence representations N Nangia, A Williams, A Lazaridou, SR Bowman arXiv preprint arXiv:1707.08172, 2017 | 109 | 2017 |
Human vs. muppet: A conservative estimate of human performance on the GLUE benchmark N Nangia, SR Bowman arXiv preprint arXiv:1905.10425, 2019 | 106 | 2019 |
QuALITY: Question answering with long input texts, yes! RY Pang, A Parrish, N Joshi, N Nangia, J Phang, A Chen, V Padmakumar, ... arXiv preprint arXiv:2112.08608, 2021 | 66 | 2021 |
jiant 1.2: A software toolkit for research on general-purpose text understanding models A Wang, IF Tenney, Y Pruksachatkun, K Yu, J Hula, P Xia, R Pappagari, ... Note: http://jiant.info/ Cited by: footnote 4, 2019 | 52 | 2019 |
Does putting a linguist in the loop improve NLU data collection? A Parrish, W Huang, O Agha, SH Lee, N Nangia, A Warstadt, K Aggarwal, ... arXiv preprint arXiv:2104.07179, 2021 | 30 | 2021 |
What ingredients make for an effective crowdsourcing protocol for difficult NLU data collection tasks? N Nangia, S Sugawara, H Trivedi, A Warstadt, C Vania, SR Bowman arXiv preprint arXiv:2106.00794, 2021 | 29 | 2021 |
What do NLP researchers believe? Results of the NLP community metasurvey J Michael, A Holtzman, A Parrish, A Mueller, A Wang, A Chen, D Madaan, ... arXiv preprint arXiv:2208.12852, 2022 | 21 | 2022 |
The multi-genre NLI corpus A Williams, N Nangia, SR Bowman | 15 | 2018 |
A broad-coverage challenge corpus for sentence understanding through inference. arXiv 2017 A Williams, N Nangia, SR Bowman arXiv preprint arXiv:1704.05426, 0 | 15 | |
Single-turn debate does not help humans answer hard reading-comprehension questions A Parrish, H Trivedi, E Perez, A Chen, N Nangia, J Phang, SR Bowman arXiv preprint arXiv:2204.05212, 2022 | 11 | 2022 |
What Makes Reading Comprehension Questions Difficult? S Sugawara, N Nangia, A Warstadt, SR Bowman arXiv preprint arXiv:2203.06342, 2022 | 10 | 2022 |
Crowdsourcing beyond annotation: Case studies in benchmark data collection A Suhr, C Vania, N Nangia, M Sap, M Yatskar, S Bowman, Y Artzi Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021 | 8 | 2021 |
Discrete latent structure in neural networks V Niculae, CF Corro, N Nangia, T Mihaylova, AFT Martins arXiv preprint arXiv:2301.07473, 2023 | 7 | 2023 |
Latent structure models for natural language processing AFT Martins, T Mihaylova, N Nangia, V Niculae Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019 | 6 | 2019 |