Parallel learning: Overview and perspective for computational learning across Syn2Real and Sim2Real

Q Miao, Y Lv, M Huang, X Wang… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org
The virtual-to-real paradigm, i.e., training models on virtual data and then applying them to
solve real-world problems, has attracted more and more attention from various domains by …

A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

NL-Augmenter: A framework for task-sensitive natural language augmentation

KD Dhole, V Gangal, S Gehrmann, A Gupta, Z Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation is an important component in the robustness evaluation of models in
natural language processing (NLP) and in enhancing the diversity of the data they are …

A survey on GAN techniques for data augmentation to address the imbalanced data issues in credit card fraud detection

E Strelcenia, S Prakoonwit - Machine Learning and Knowledge Extraction, 2023 - mdpi.com
Data augmentation is an important procedure in deep learning. GAN-based data
augmentation can be utilized in many domains. For instance, in the credit card fraud domain …

Large language models as annotators: Enhancing generalization of NLP models at minimal cost

P Bansal, A Sharma - arXiv preprint arXiv:2306.15766, 2023 - arxiv.org
State-of-the-art supervised NLP models achieve high accuracy but are also susceptible to
failures on inputs from low-data regimes, such as domains that are not represented in …

Data augmentation using LLMs: Data perspectives, learning paradigms and challenges

B Ding, C Qin, R Zhao, T Luo, X Li, G Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In the rapidly evolving field of machine learning (ML), data augmentation (DA) has emerged
as a pivotal technique for enhancing model performance by diversifying training examples …

DoCoGen: Domain counterfactual generation for low resource domain adaptation

N Calderon, E Ben-David, A Feder… - arXiv preprint arXiv …, 2022 - arxiv.org
Natural language processing (NLP) algorithms have become very successful, but they still
struggle when applied to out-of-distribution examples. In this paper we propose a …

ALP: Data augmentation using lexicalized PCFGs for few-shot text classification

HH Kim, D Woo, SJ Oh, JW Cha, YS Han - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Data augmentation has been an important ingredient for boosting the performance of learned
models. Prior data augmentation methods for few-shot text classification have led to great …

To augment or not to augment? A comparative study on text augmentation techniques for low-resource NLP

GG Şahin - Computational Linguistics, 2022 - direct.mit.edu
Data-hungry deep neural networks have established themselves as the de facto standard for
many NLP tasks, including the traditional sequence tagging ones. Despite their state-of-the …

Text autoaugment: Learning compositional augmentation policy for text classification

S Ren, J Zhang, L Li, X Sun, J Zhou - arXiv preprint arXiv:2109.00523, 2021 - arxiv.org
Data augmentation aims to enrich training samples for alleviating the overfitting issue in low-
resource or class-imbalanced situations. Traditional methods first devise task-specific …