Training dynamics of multi-head softmax attention for in-context learning: Emergence, convergence, and optimality

S Chen, H Sheen, T Wang, Z Yang - arXiv preprint arXiv:2402.19442, 2024 - arxiv.org
We study the dynamics of gradient flow for training a multi-head softmax attention model for
in-context learning of multi-task linear regression. We establish the global convergence of …

Retrieval-augmented decision transformer: External memory for in-context rl

T Schmied, F Paischer, V Patil, M Hofmarcher… - arXiv preprint arXiv …, 2024 - arxiv.org
In-context learning (ICL) is the ability of a model to learn a new task by observing a few
exemplars in its context. While prevalent in NLP, this capability has recently also been …

Unveiling induction heads: Provable training dynamics and feature learning in transformers

S Chen, H Sheen, T Wang, Z Yang - arXiv preprint arXiv:2409.10559, 2024 - arxiv.org
In-context learning (ICL) is a cornerstone of large language model (LLM) functionality, yet its
theoretical foundations remain elusive due to the complexity of transformer architectures. In …

Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models

C Shi, K Yang, J Yang, C Shen - arXiv preprint arXiv:2410.09701, 2024 - arxiv.org
The in-context learning (ICL) capability of pre-trained models based on the transformer
architecture has received growing interest in recent years. While theoretical understanding …

Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

S Mukherjee, JP Hanna, Q Xie, R Nowak - arXiv preprint arXiv:2406.05064, 2024 - arxiv.org
In this paper, we study the multi-task structured bandit problem where the goal is to learn a near-
optimal algorithm that minimizes cumulative regret. The tasks share a common structure and …

N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs

I Zisman, A Nikulin, A Polubarov, N Lyubaykin… - arXiv preprint arXiv …, 2024 - arxiv.org
In-context learning allows models like transformers to adapt to new tasks from a few
examples without updating their weights, a desirable trait for reinforcement learning (RL) …

SAD: State-Action Distillation for In-Context Reinforcement Learning under Random Policies

W Chen, S Paternain - arXiv preprint arXiv:2410.19982, 2024 - arxiv.org
Pretrained foundation models have exhibited extraordinary in-context learning performance,
allowing zero-shot generalization to new tasks not encountered during the pretraining. In the …

HVAC-DPT: A Decision Pretrained Transformer for HVAC Control

A Berkes - arXiv preprint arXiv:2411.19746, 2024 - arxiv.org
Building operations consume approximately 40% of global energy, with Heating, Ventilation,
and Air Conditioning (HVAC) systems responsible for up to 50% of this consumption. As …

Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning

J Son, S Lee, G Kim - jaehyeon-son.github.io
Recent studies have demonstrated that Transformers can perform in-context reinforcement
learning (RL) by imitating a source RL algorithm. This enables them to adapt to new tasks in …

Method and Algorithm for Feature Extraction from Digital Signals Based on Transformer Neural Networks

ZA Ponimash, MV Potanin - Izvestiya SFedU. Engineering Sciences, 2024 - izv-tn.tti.sfedu.ru
Recently, neural network models have become one of the most promising approaches to
automatic feature extraction from digital signals …