State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Towards debiasing NLU models from unknown biases

PA Utama, NS Moosavi, I Gurevych - arXiv preprint arXiv:2009.12303, 2020 - arxiv.org
NLU models often exploit biases to achieve high dataset-specific performance without
properly learning the intended task. Recently proposed debiasing methods are shown to be …

A survey on measuring and mitigating reasoning shortcuts in machine reading comprehension

X Ho, JM Meissner, S Sugawara, A Aizawa - arXiv preprint arXiv …, 2022 - arxiv.org
The issue of shortcut learning is widely known in NLP and has been an important research
focus in recent years. Unintended correlations in the data enable models to easily solve …

Generalized but not robust? Comparing the effects of data modification methods on out-of-domain generalization and adversarial robustness

T Gokhale, S Mishra, M Luo, BS Sachdeva… - arXiv preprint arXiv …, 2022 - arxiv.org
Data modification, whether via additional training datasets, data augmentation, debiasing, or
dataset filtering, has been proposed as an effective solution for generalizing to out-of …

Quantifying and attributing the hallucination of large language models via association analysis

L Du, Y Wang, X Xing, Y Ya, X Li, X Jiang… - arXiv preprint arXiv …, 2023 - arxiv.org
Although they demonstrate superb performance on various NLP tasks, large language models
(LLMs) still suffer from the hallucination problem, which threatens their reliability. To …

Which shortcut solution do question answering models prefer to learn?

K Shinoda, S Sugawara, A Aizawa - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
Question answering (QA) models for reading comprehension tend to exploit spurious
correlations in training sets and thus learn shortcut solutions rather than the solutions …

An empirical study on model-agnostic debiasing strategies for robust natural language inference

T Liu, X Zheng, X Ding, B Chang, Z Sui - arXiv preprint arXiv:2010.03777, 2020 - arxiv.org
Prior work on natural language inference (NLI) debiasing mainly targets one or a few
known biases while not necessarily making the models more robust. In this paper, we focus …

SMoA: Sparse mixture of adapters to mitigate multiple dataset biases

Y Liu, J Yan, Y Chen, J Liu, H Wu - arXiv preprint arXiv:2302.14413, 2023 - arxiv.org
Recent studies reveal that various biases exist in different NLP tasks, and over-reliance on
biases results in models' poor generalization ability and low adversarial robustness. To …

Methods for estimating and improving robustness of language models

M Štefánik - arXiv preprint arXiv:2206.08446, 2022 - arxiv.org
Despite their outstanding performance, large language models (LLMs) suffer from notorious flaws
related to their preference for simple, surface-level textual relations over full semantic …

Coreference reasoning in machine reading comprehension

M Wu, NS Moosavi, D Roth, I Gurevych - arXiv preprint arXiv:2012.15573, 2020 - arxiv.org
Coreference resolution is essential for natural language understanding and has long been
studied in NLP. In recent years, as the format of Question Answering (QA) became a …