Data augmentation approaches in natural language processing: A survey

B Li, Y Hou, W Che - AI Open, 2022 - Elsevier
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision then introduced to …

A survey on data augmentation for text classification

M Bayer, MA Kaufhold, C Reuter - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …

Contrastive learning reduces hallucination in conversations

W Sun, Z Shi, S Gao, P Ren, M de Rijke… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Pre-trained language models (LMs) store knowledge in their parameters and can generate
informative responses when used in conversational systems. However, LMs suffer from the …

NL-Augmenter: A framework for task-sensitive natural language augmentation

KD Dhole, V Gangal, S Gehrmann, A Gupta, Z Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation is an important component in the robustness evaluation of models in
natural language processing (NLP) and in enhancing the diversity of the data they are …

Exploring new frontiers in agricultural NLP: Investigating the potential of large language models for food applications

S Rezayi, Z Liu, Z Wu, C Dhakal, B Ge… - … Transactions on Big …, 2024 - ieeexplore.ieee.org
This paper explores new frontiers in agricultural natural language processing (NLP) by
investigating the effectiveness of food-related text corpora for pretraining transformer-based …

ClinicalRadioBERT: Knowledge-infused few shot learning for clinical notes named entity recognition

S Rezayi, H Dai, Z Liu, Z Wu, A Hebbar… - … Workshop on Machine …, 2022 - Springer
Transformer-based language models such as BERT have been widely applied to many
domains through model pretraining and fine tuning. However, in low-resource scenarios …

AgriBERT: Knowledge-Infused Agricultural Language Models for Matching Food and Nutrition

S Rezayi, Z Liu, Z Wu, C Dhakal, B Ge, C Zhen, T Liu… - IJCAI, 2022 - researchgate.net
Pretraining domain-specific language models remains an important challenge which limits
their applicability in various areas such as agriculture. This paper investigates the …

TreeMix: Compositional constituency-based data augmentation for natural language understanding

L Zhang, Z Yang, D Yang - arXiv preprint arXiv:2205.06153, 2022 - arxiv.org
Data augmentation is an effective approach to tackle over-fitting. Many previous works have
proposed different data augmentation strategies for NLP, such as noise injection, word …

Data augmentation for sentiment classification with semantic preservation and diversity

G Chao, J Liu, M Wang, D Chu - Knowledge-Based Systems, 2023 - Elsevier
Data augmentation is a commonly used technique to avoid over-fitting in deep learning.
However, the mechanism behind effective data augmentation methods is unclear. To …