Uncertainty in natural language processing: Sources, quantification, and applications

M Hu, Z Zhang, S Zhao, M Huang, B Wu - arXiv preprint arXiv:2306.04459, 2023 - arxiv.org
As a major field of artificial intelligence, natural language processing (NLP) has achieved
remarkable success via deep neural networks. Many NLP tasks have been addressed in …

Uncertainty quantification with pre-trained language models: A large-scale empirical analysis

Y Xiao, PP Liang, U Bhatt, W Neiswanger… - arXiv preprint arXiv …, 2022 - arxiv.org
Pre-trained language models (PLMs) have gained increasing popularity due to their
compelling prediction performance in diverse natural language processing (NLP) tasks …
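As background for this entry, the sketch below shows two confidence scores that large-scale UQ analyses of classifiers commonly compare: maximum softmax probability and predictive entropy. This is a minimal illustration, not the paper's evaluation protocol; the `logits` array is a toy stand-in for real model outputs.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability before exponentiating.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def max_prob_confidence(logits):
    """Confidence = probability assigned to the argmax class."""
    return softmax(logits).max(axis=-1)

def predictive_entropy(logits):
    """Higher entropy = flatter distribution = more uncertain prediction."""
    p = softmax(logits)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

logits = np.array([[3.2, 0.1, -1.0],   # peaked: confident
                   [0.4, 0.3, 0.2]])   # near-uniform: uncertain
print(max_prob_confidence(logits))
print(predictive_entropy(logits))
```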

Retrieve-and-sample: Document-level event argument extraction via hybrid retrieval augmentation

Y Ren, Y Cao, P Guo, F Fang, W Ma… - Proceedings of the 61st …, 2023 - aclanthology.org
Recent studies have shown the effectiveness of retrieval augmentation in many generative
NLP tasks. These retrieval-augmented methods allow models to explicitly acquire prior …
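As a rough illustration of the general retrieve-then-generate pattern (this is not the paper's hybrid retrieval method), the hypothetical sketch below retrieves the most similar training example by bag-of-words cosine similarity and prepends it to the generator's input; all names are illustrative.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str]) -> str:
    # Return the corpus document most similar to the query.
    q = Counter(query.lower().split())
    return max(corpus, key=lambda d: cosine(q, Counter(d.lower().split())))

corpus = ["the company acquired a startup", "the storm damaged the coast"]
query = "which startup did the company acquire"
augmented_input = retrieve(query, corpus) + " [SEP] " + query  # fed to the generator
print(augmented_input)
```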

Learning to generalize to more: Continuous semantic augmentation for neural machine translation

X Wei, H Yu, Y Hu, R Weng, W Luo, J Xie… - arXiv preprint arXiv …, 2022 - arxiv.org
The principal task in supervised neural machine translation (NMT) is to learn to generate
target sentences conditioned on the source inputs from a set of parallel sentence pairs, and …
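For reference, the standard supervised objective the snippet describes is token-level maximum likelihood over a parallel corpus D (general NMT background, not this paper's continuous-augmentation method):

\mathcal{L}(\theta) = -\sum_{(x,\,y)\in D} \sum_{t=1}^{|y|} \log p_\theta\!\left(y_t \mid y_{<t},\, x\right)

where x is a source sentence, y its reference translation, and y_{<t} the target prefix generated so far.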

Self-training sampling with monolingual data uncertainty for neural machine translation

W Jiao, X Wang, Z Tu, S Shi, MR Lyu, I King - arXiv preprint arXiv …, 2021 - arxiv.org
Self-training has proven effective for improving NMT performance by augmenting model
training with synthetic parallel data. The common practice is to construct synthetic data …
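The generic self-training loop the snippet alludes to looks roughly like the sketch below. The paper's actual contribution, selecting monolingual sentences by uncertainty, is not reproduced here, and `translate`/`train` are hypothetical stand-ins for a real NMT pipeline.

```python
def self_train(model, parallel_data, monolingual_src, translate, train, rounds=2):
    for _ in range(rounds):
        # Label monolingual source sentences with the current model.
        synthetic = [(x, translate(model, x)) for x in monolingual_src]
        # Retrain on authentic + synthetic parallel data.
        model = train(parallel_data + synthetic)
    return model

# Toy stubs so the sketch runs end-to-end:
def translate(model, x):   # pretend "translation" = uppercasing
    return x.upper()

def train(data):           # pretend "training" just counts examples
    return {"examples_seen": len(data)}

print(self_train(None, [("hallo", "hello")], ["guten tag", "danke"], translate, train))
```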

Rethinking data augmentation for low-resource neural machine translation: A multi-task learning approach

VM Sánchez-Cartagena, M Esplà-Gomis… - arXiv preprint arXiv …, 2021 - arxiv.org
In the context of neural machine translation, data augmentation (DA) techniques may be
used for generating additional training samples when the available parallel data are scarce …
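One of the simplest DA operations of this kind is source-side word dropout, which yields extra synthetic pairs from existing ones; this is illustrative only, as the paper studies a set of transformations trained in a multi-task setup.

```python
import random

def word_dropout(sentence: str, p: float = 0.1, seed: int = 0) -> str:
    # Randomly drop each word with probability p to create a noised variant.
    rng = random.Random(seed)
    kept = [w for w in sentence.split() if rng.random() > p]
    return " ".join(kept) if kept else sentence

pair = ("la casa azul", "the blue house")
augmented = (word_dropout(pair[0]), pair[1])  # new synthetic training pair
print(augmented)
```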

Exploring predictive uncertainty and calibration in NLP: A study on the impact of method & data scarcity

D Ulmer, J Frellsen, C Hardmeier - arXiv preprint arXiv:2210.15452, 2022 - arxiv.org
We investigate the problem of determining the predictive confidence (or, conversely,
uncertainty) of a neural classifier through the lens of low-resource languages. By training …
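A standard metric in such calibration studies is expected calibration error (ECE). The sketch below is a minimal equal-width-binning version, not the paper's exact setup; the toy predictions are illustrative.

```python
import numpy as np

def ece(confidences, correct, n_bins=10):
    """Average |accuracy - confidence| gap over equal-width confidence bins,
    weighted by the fraction of predictions in each bin."""
    confidences, correct = np.asarray(confidences), np.asarray(correct)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            total += mask.mean() * gap
    return total

print(ece([0.9, 0.8, 0.6, 0.55], [1, 1, 0, 1]))  # toy predictions
```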

Confidence-aware scheduled sampling for neural machine translation

Y Liu, F Meng, Y Chen, J Xu, J Zhou - arXiv preprint arXiv:2107.10427, 2021 - arxiv.org
Scheduled sampling is an effective method to alleviate the exposure bias problem of neural
machine translation. It simulates the inference scenario by randomly replacing ground-truth …
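A minimal sketch of vanilla scheduled sampling is shown below. The paper's confidence-aware variant, which decides replacements from model confidence rather than uniformly at random, is not reproduced, and `predict_next` is a hypothetical stand-in for a decoder step.

```python
import random

def decoder_inputs(gold_tokens, predict_next, replace_prob, seed=0):
    """Build decoder input tokens, mixing gold and model-predicted history."""
    rng = random.Random(seed)
    inputs, prev = [], "<bos>"
    for gold in gold_tokens:
        inputs.append(prev)
        # With probability replace_prob, feed the model's own prediction back
        # in (simulating inference) instead of the ground-truth token.
        prev = predict_next(prev) if rng.random() < replace_prob else gold
    return inputs

toy_predict = lambda prev: prev + "'"   # stand-in "model prediction"
print(decoder_inputs(["the", "cat", "sat"], toy_predict, replace_prob=0.5))
```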

Unsupervised learning of deterministic dialogue structure with edge-enhanced graph auto-encoder

Y Sun, Y Shan, C Tang, Y Hu, Y Dai, J Yu… - Proceedings of the …, 2021 - ojs.aaai.org
It is important for task-oriented dialogue systems to discover the dialogue structure (i.e., the
general dialogue flow) from dialogue corpora automatically. Previous work models dialogue …

SUN: Exploring intrinsic uncertainties in text-to-SQL parsers

B Qin, L Wang, B Hui, B Li, X Wei, B Li, F Huang… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic
uncertainties in neural network-based approaches (a method called SUN). From the data …