Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text

S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org
Abstract Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …

A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

OctoPack: Instruction tuning code large language models

N Muennighoff, Q Liu, A Zebaze, Q Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Finetuning large language models (LLMs) on instructions leads to vast performance
improvements on natural language tasks. We apply instruction tuning using code …

Red teaming ChatGPT via jailbreaking: Bias, robustness, reliability and toxicity

TY Zhuo, Y Huang, C Chen, Z Xing - arXiv preprint arXiv:2301.12867, 2023 - arxiv.org
Recent breakthroughs in natural language processing (NLP) have permitted the synthesis
and comprehension of coherent text in an open-ended way, thereby translating the …

Measure and improve robustness in NLP models: A survey

X Wang, H Wang, D Yang - arXiv preprint arXiv:2112.08313, 2021 - arxiv.org
As NLP models have achieved state-of-the-art performance on benchmarks and gained wide
application, it has become increasingly important to ensure the safe deployment of these …

HRS-Bench: Holistic, reliable and scalable benchmark for text-to-image models

EM Bakr, P Sun, X Shen, FF Khan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Designing robust text-to-image (T2I) models has been extensively explored in recent years,
especially with the emergence of diffusion models, which achieve state-of-the-art results on …

NLPositionality: Characterizing design biases of datasets and models

S Santy, JT Liang, RL Bras, K Reinecke… - arXiv preprint arXiv …, 2023 - arxiv.org
Design biases in NLP systems, such as performance differences across populations,
often stem from their creators' positionality, i.e., views and lived experiences shaped by …

The state of human-centered NLP technology for fact-checking

A Das, H Liu, V Kovatchev, M Lease - Information processing & …, 2023 - Elsevier
Misinformation threatens modern society by promoting distrust in science, changing
narratives in public health, heightening social polarization, and disrupting democratic …

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …