Mechanistic Interpretability for AI Safety - A Review

L Bereska, E Gavves - arXiv preprint arXiv:2404.14082, 2024 - arxiv.org
Understanding AI systems' inner workings is critical for ensuring value alignment and safety.
This review explores mechanistic interpretability: reverse-engineering the computational …

Interpreting grokked transformers in complex modular arithmetic

H Furuta, G Minegishi, Y Iwasawa, Y Matsuo - arXiv preprint arXiv:2402.16726, 2024 - arxiv.org
Grokking has been actively explored to unravel the mystery of delayed generalization.
Identifying interpretable algorithms inside grokked models offers a suggestive hint to …

Hypothesis Testing the Circuit Hypothesis in LLMs

C Shi, N Beltran-Velez, A Nazaret, C Zheng… - ICML 2024 Workshop on … - openreview.net
Large language models (LLMs) demonstrate surprising capabilities, but we do not yet
understand how they are implemented. One hypothesis suggests that these capabilities are …