Bandit algorithms

Causal inference in the social sciences

GW Imbens - Annual Review of Statistics and Its Application, 2024 - annualreviews.org

Knowledge of causal effects is of great importance to decision makers in a wide variety of
settings. In many cases, however, these causal effects are not known to the decision makers …

Cited by 10 · All 2 versions

Pervasive AI for IoT applications: A survey on resource-efficient distributed artificial intelligence

E Baccour, N Mhaisen, AA Abdellatif… - … Surveys & Tutorials, 2022 - ieeexplore.ieee.org

Artificial intelligence (AI) has witnessed a substantial breakthrough in a variety of Internet of
Things (IoT) applications and services, spanning from recommendation systems and speech …

Cited by 82 · All 10 versions

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 382 · All 7 versions

Cal-QL: Calibrated offline RL pre-training for efficient online fine-tuning

M Nakamoto, S Zhai, A Singh… - Advances in …, 2024 - proceedings.neurips.cc

A compelling use case of offline reinforcement learning (RL) is to obtain a policy initialization
from existing datasets followed by fast online fine-tuning with limited interaction. However …

Cited by 57 · All 7 versions

The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Cited by 168 · All 6 versions

Provably efficient reinforcement learning with linear function approximation

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press

Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …

Cited by 677 · All 4 versions

[BOOK][B] Control systems and reinforcement learning

S Meyn - 2022 - books.google.com

A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Cited by 125 · All 3 versions

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

D Zhou, Q Gu, C Szepesvári - Conference on Learning …, 2021 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Cited by 216 · All 7 versions

When is partially observable reinforcement learning not scary?

Q Liu, A Chung, C Szepesvári… - Conference on Learning …, 2022 - proceedings.mlr.press

Partial observability is ubiquitous in applications of Reinforcement Learning (RL), in which
agents learn to make a sequence of decisions despite lacking complete information about …

Cited by 86 · All 7 versions

River: machine learning for streaming data in Python

J Montiel, M Halford, SM Mastelini, G Bolmier… - Journal of Machine …, 2021 - jmlr.org

River is a machine learning library for dynamic data streams and continual learning. It
provides multiple state-of-the-art learning methods, data generators/transformers …

Cited by 210 · All 12 versions
