A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv …, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

RWKV: Reinventing RNNs for the transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …

The responsible foundation model development cheatsheet: A review of tools & resources

S Longpre, S Biderman, A Albalak… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation model development attracts a rapidly expanding body of contributors, scientists,
and applications. To help shape responsible development practices, we introduce the …

Attentional mixtures of soft prompt tuning for parameter-efficient multi-task knowledge sharing

A Asai, M Salehi, ME Peters… - arXiv preprint arXiv …, 2022 - homes.cs.washington.edu
This work introduces ATTEMPT (ATTEntional Mixture of Prompt Tuning), a new modular,
multi-task, and parameter-efficient language model (LM) tuning approach that combines …

Improving few-shot generalization by exploring and exploiting auxiliary data

A Albalak, CA Raffel, WY Wang - Advances in Neural …, 2024 - proceedings.neurips.cc
Few-shot learning is valuable in many real-world applications, but learning a generalizable
model without overfitting to the few labeled datapoints is challenging. In this work, we focus …

Lower bounds on the expressivity of recurrent neural language models

A Svete, F Nowak, AM Sahabdeen… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent successes and spread of large neural language models (LMs) call for a thorough
understanding of their computational ability. Describing their computational abilities through …

Leveraging machine-generated rationales to facilitate social meaning detection in conversations

R Dutt, Z Wu, K Shi, D Sheth, P Gupta… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a generalizable classification approach that leverages Large Language Models
(LLMs) to facilitate the detection of implicitly encoded social meaning in conversations. We …

TaskWeb: Selecting better source tasks for multi-task NLP

J Kim, A Asai, G Ilharco, H Hajishirzi - arXiv preprint arXiv:2305.13256, 2023 - arxiv.org
Recent work in NLP has shown promising results in training models on large numbers of
tasks to achieve better generalization. However, it is not well-understood how tasks are …

D-REX: Dialogue relation extraction with explanations

A Albalak, V Embar, YL Tuan, L Getoor… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing research studies on cross-sentence relation extraction in long-form multi-party
conversations aim to improve relation extraction without considering the explainability of …

Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study

M Reuver, S Verberne, A Fokkens - arXiv preprint arXiv:2404.03987, 2024 - arxiv.org
For a viewpoint-diverse news recommender, identifying whether two news articles express
the same viewpoint is essential. One way to determine "same or different" viewpoint is …