Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Gated linear attention transformers with hardware-efficient training

S Yang, B Wang, Y Shen, R Panda, Y Kim - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers with linear attention allow for efficient parallel training but can simultaneously
be formulated as an RNN with 2D (matrix-valued) hidden states, thus enjoying linear (with …

Benchmarks as microscopes: A call for model metrology

M Saxon, A Holtzman, P West, WY Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern language models (LMs) pose a new challenge in capability assessment. Static
benchmarks inevitably saturate without providing confidence in the deployment tolerances …

LLM-rubric: A multidimensional, calibrated approach to automated evaluation of natural language texts

H Hashemi, J Eisner, C Rosset, B Van Durme… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces a framework for the automated evaluation of natural language texts. A
manually constructed rubric describes how to assess multiple dimensions of interest. To …

What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models

S Zhu, JM Rzeszotarski - arXiv preprint arXiv:2407.01929, 2024 - arxiv.org
The term Language Models (LMs), as a time-specific collection of models of interest, is
constantly reinvented, with its referents updated much like the Ship of Theseus …

Evaluation and Continual Improvement for an Enterprise AI Assistant

AV Maharaj, K Qian, U Bhattacharya, S Fang… - arXiv preprint arXiv …, 2024 - arxiv.org
The development of conversational AI assistants is an iterative process with multiple
components. As such, the evaluation and continual improvement of these assistants is a …

Natural Language Processing RELIES on Linguistics

J Opitz, S Wein, N Schneider - arXiv preprint arXiv:2405.05966, 2024 - arxiv.org
Large Language Models (LLMs) have become capable of generating highly fluent text in
certain languages, without modules specially designed to capture grammar or semantic …

Balancing specialization and adaptation in a transforming scientific landscape

L Gautheron - EPJ Data Science, 2025 - epjds.epj.org
How do scientists navigate between the need to capitalize on their prior knowledge through
specialization, and the urge to adapt to evolving research opportunities? Drawing from …

Towards Compositionally Generalizable Semantic Parsing in Large Language Models: A Survey

A Mannekote - arXiv preprint arXiv:2404.13074, 2024 - arxiv.org
Compositional generalization is the ability of a model to generalize to complex, previously
unseen types of combinations of entities from just having seen the primitives. This type of …

Brains Over Brawn: Small AI Labs in the Age of Datacenter-Scale Compute

J Put, N Michiels, B Vanherle, B Zoomers - International Conference on …, 2024 - Springer
The prevailing trend towards large models that demand extensive computational resources
threatens to marginalize smaller research labs, constraining innovation and diversity in the …