State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Automated data processing and feature engineering for deep learning and big data applications: a survey

A Mumuni, F Mumuni - Journal of Information and Intelligence, 2024 - Elsevier
Modern approach to artificial intelligence (AI) aims to design algorithms that learn directly
from data. This approach has achieved impressive results and has contributed significantly …

A survey of automated data augmentation for image classification: Learning to compose, mix, and generate

TH Cheung, DY Yeung - IEEE transactions on neural networks …, 2023 - ieeexplore.ieee.org
Data augmentation is an effective way to improve the generalization of deep learning
models. However, the underlying augmentation methods mainly rely on handcrafted …

A survey of automated data augmentation algorithms for deep learning-based image classification tasks

Z Yang, RO Sinnott, J Bailey, Q Ke - Knowledge and Information Systems, 2023 - Springer
In recent years, one of the most popular techniques in the computer vision community has
been the deep learning technique. As a data-driven technique, deep model requires …

GDA: Generative data augmentation techniques for relation extraction tasks

X Hu, A Liu, Z Tan, X Zhang, C Zhang, I King… - arXiv preprint arXiv …, 2023 - arxiv.org
Relation extraction (RE) tasks show promising performance in extracting relations from two
entities mentioned in sentences, given sufficient annotations available during training. Such …

Towards better detection of biased language with scarce, noisy, and biased annotations

Z Li, Z Lu, M Yin - Proceedings of the 2022 AAAI/ACM Conference on AI …, 2022 - dl.acm.org
Biased language is prevalent in today's online social media. To reduce the amount of online
biased language, one critical first step is to accurately detect such biased language, ideally …

Boosting text augmentation via hybrid instance filtering framework

H Yang, K Li - Findings of the Association for Computational …, 2023 - aclanthology.org
Text augmentation is an effective technique for addressing the problem of insufficient data in
natural language processing. However, existing text augmentation methods tend to focus on …

What makes better augmentation strategies? Augment difficult but not too different

J Kim, D Kang, S Ahn, J Shin - International Conference on Learning …, 2021 - openreview.net
The practice of data augmentation has been extensively used to boost the performance of
deep neural networks for various NLP tasks. It is more effective when only a limited number …

Application of generative adversarial networks and Shapley algorithm based on easy data augmentation for imbalanced text data

JL Wu, S Huang - Applied Sciences, 2022 - mdpi.com
Imbalanced data constitute an extensively studied problem in the field of machine learning
classification because they result in poor training outcomes. Data augmentation is a method …

AugCSE: Contrastive sentence embedding with diverse augmentations

Z Tang, MY Kocyigit, D Wijaya - arXiv preprint arXiv:2210.13749, 2022 - arxiv.org
Data augmentation techniques have been proven useful in many applications in NLP fields.
Most augmentations are task-specific, and cannot be used as a general-purpose tool. In our …