Generalization through memorization: Nearest neighbor language models

Y Gao, Y Xiong, X Gao, K Jia, J Pan, Y Bi, Y Dai… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …

Cited by 299 · Related articles · All 4 versions

[HTML] sciencedirect.com

[HTML] Data augmentation approaches in natural language processing: A survey

B Li, Y Hou, W Che - AI Open, 2022 - Elsevier

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision then introduced to …

Cited by 243 · Related articles · All 5 versions

[PDF] mit.edu

In-context retrieval-augmented language models

O Ram, Y Levine, I Dalmedigos, D Muhlgay… - Transactions of the …, 2023 - direct.mit.edu

Abstract Retrieval-Augmented Language Modeling (RALM) methods, which condition a
language model (LM) on relevant documents from a grounding corpus during generation …

Cited by 256 · Related articles · All 7 versions

[PDF] acm.org

Taxonomy of risks posed by language models

L Weidinger, J Uesato, M Rauh, C Griffin… - Proceedings of the …, 2022 - dl.acm.org

Responsible innovation on large-scale Language Models (LMs) requires foresight into and
in-depth understanding of the risks these models may pose. This paper develops a …

Cited by 383 · Related articles · All 7 versions

[PDF] arxiv.org

Augmented language models: a survey

G Mialon, R Dessì, M Lomeli, C Nalmpantis… - arXiv preprint arXiv …, 2023 - arxiv.org

This survey reviews works in which language models (LMs) are augmented with reasoning
skills and the ability to use tools. The former is defined as decomposing a potentially …

Cited by 353 · Related articles · All 3 versions

[PDF] neurips.cc

LeanDojo: Theorem proving with retrieval-augmented language models

K Yang, A Swope, A Gu, R Chalamala… - Advances in …, 2024 - proceedings.neurips.cc

Large language models (LLMs) have shown promise in proving formal theorems using proof
assistants such as Lean. However, existing methods are difficult to reproduce or build on …

Cited by 101 · Related articles · All 9 versions

[PDF] mlr.press

Memory-based model editing at scale

E Mitchell, C Lin, A Bosselut… - International …, 2022 - proceedings.mlr.press

Even the largest neural networks make errors, and once-correct predictions can become
invalid as the world changes. Model editors make local updates to the behavior of base (pre …

Cited by 196 · Related articles · All 6 versions

[PDF] arxiv.org

LaMDA: Language models for dialog applications

R Thoppilan, D De Freitas, J Hall, N Shazeer… - arXiv preprint arXiv …, 2022 - arxiv.org

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of
Transformer-based neural language models specialized for dialog, which have up to 137B …

Cited by 1247 · Related articles · All 6 versions

[PDF] arxiv.org

Tip-Adapter: Training-free adaption of CLIP for few-shot classification

R Zhang, W Zhang, R Fang, P Gao, K Li, J Dai… - European conference on …, 2022 - Springer

Abstract Contrastive Vision-Language Pre-training, known as CLIP, has provided a new
paradigm for learning visual representations using large-scale image-text pairs. It shows …

Cited by 188 · Related articles · All 6 versions

[PDF] arxiv.org

Text-to-image diffusion models in generative AI: A survey

C Zhang, C Zhang, M Zhang, IS Kweon - arXiv preprint arXiv:2303.07909, 2023 - arxiv.org

This survey reviews text-to-image diffusion models in the context that diffusion models have
emerged to be popular for a wide range of generative tasks. As a self-contained work, this …

Cited by 197 · Related articles · All 4 versions
