A mechanistic interpretation of syllogistic reasoning in auto-regressive language models

G Kim, M Valentino, A Freitas - arXiv preprint arXiv:2408.08590, 2024 - arxiv.org
Recent studies on logical reasoning in auto-regressive Language Models (LMs) have
sparked a debate on whether such models can learn systematic reasoning principles during …

Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

BK Chen, T Hu, H Jin, HK Lee, K Kawaguchi - arXiv preprint arXiv …, 2024 - arxiv.org
In-Context Learning (ICL) is a powerful emergent property of large language models
that has attracted increasing attention in recent years. In contrast to regular gradient-based …
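
The snippet is cut off, but the title points to a well-known algebraic property of linearized attention that plausibly underlies the result: because the attention state is a sum of key-value outer products, in-context demonstrations contribute an additive term that can be folded exactly into a weight matrix. Below is a minimal NumPy sketch of that identity; the unnormalized formulation, the feature map `phi`, and all dimensions are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                         # head dimension (assumed)
phi = lambda x: np.maximum(x, 0.0) + 1e-3     # hypothetical positive feature map

def linear_attention(Q, K, V, S0=None):
    """Unnormalized causal linearized attention: out_t = phi(q_t) @ S_t,
    where S_t accumulates phi(k_i) v_i^T over tokens i <= t."""
    S = np.zeros((d, d)) if S0 is None else S0.copy()
    outs = []
    for q, k, v in zip(Q, K, V):
        S = S + np.outer(phi(k), v)           # additive key-value state update
        outs.append(phi(q) @ S)
    return np.array(outs), S

Qc, Kc, Vc = rng.normal(size=(3, 5, d))       # 5 in-context demonstration tokens
Qq, Kq, Vq = rng.normal(size=(3, 2, d))       # 2 query tokens

# (a) Attention over the full prompt [context; query].
full_out, _ = linear_attention(np.vstack([Qc, Qq]),
                               np.vstack([Kc, Kq]),
                               np.vstack([Vc, Vq]))

# (b) Fold the context into a weight update Delta_W = sum_i phi(k_i) v_i^T,
#     then attend over the query tokens alone.
_, delta_W = linear_attention(Qc, Kc, Vc)
query_out, _ = linear_attention(Qq, Kq, Vq, S0=delta_W)

# The conversion is exact: the query outputs match to machine precision.
assert np.allclose(full_out[-2:], query_out)
```

The same linearity argument does not go through for softmax attention, which is presumably why the result is stated for linearized-attention transformers.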

Transformers Provably Solve Parity Efficiently with Chain of Thought

J Kim, T Suzuki - arXiv preprint arXiv:2410.08633, 2024 - arxiv.org
This work provides the first theoretical analysis of training transformers to solve complex
problems by recursively generating intermediate states, analogous to fine-tuning for chain-of …
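
The truncated snippet gestures at the paper's setup: parity over n bits is hard to compute in one shot but decomposes into n easy steps if the model may emit intermediate states. As a toy illustration only (not the paper's transformer construction or proof), the sketch below contrasts one-shot parity with a chain-of-thought-style trace of running prefix parities, where each emitted state depends on just the previous state and one new bit.

```python
from functools import reduce
from operator import xor

def parity_direct(bits):
    """One-shot parity: XOR of all input bits at once."""
    return reduce(xor, bits, 0)

def parity_with_chain_of_thought(bits):
    """Parity via intermediate states: each step XORs one bit into a
    running state, so every emitted token is a trivial local computation."""
    chain, state = [], 0
    for b in bits:
        state ^= b           # one easy step per generated token
        chain.append(state)  # the "reasoning trace" of prefix parities
    return chain, state

bits = [1, 0, 1, 1, 0, 1]
chain, answer = parity_with_chain_of_thought(bits)
assert answer == parity_direct(bits)
print(chain)  # [1, 1, 0, 1, 1, 0]; the final state is the parity
```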