Stochastic dynamic programming with factored representations

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org

In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Cited by 266 - Related articles - All 9 versions

[PDF] jair.org

Decision-theoretic planning: Structural assumptions and computational leverage

C Boutilier, T Dean, S Hanks - Journal of Artificial Intelligence Research, 1999 - jair.org

Planning under uncertainty is a central problem in the study of automated sequential
decision making, and has been addressed by researchers in many different fields, including …

Cited by 1581 - Related articles - All 27 versions

[PDF] jair.org Full View

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 328 - Related articles - All 9 versions

[PDF] fransoliehoek.net

[BOOK] A concise introduction to decentralized POMDPs

FA Oliehoek, C Amato - 2016 - Springer

This book presents an overview of formal decision making methods for decentralized
cooperative systems. It is aimed at graduate students and researchers in the fields of …

Cited by 1213 - Related articles - All 13 versions

[PDF] jmlr.org

Deep exploration via randomized value functions

I Osband, B Van Roy, DJ Russo, Z Wen - Journal of Machine Learning …, 2019 - jmlr.org

We study the use of randomized value functions to guide deep exploration in reinforcement
learning. This offers an elegant means for synthesizing statistically and computationally …

Cited by 337 - Related articles - All 9 versions

[PDF] arxiv.org

A survey on interpretable reinforcement learning

C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao… - Machine Learning, 2024 - Springer

Although deep reinforcement learning has become a promising machine learning approach
for sequential decision-making problems, it is still not mature enough for high-stake domains …

Cited by 68 - Related articles - All 3 versions

[PDF] neurips.cc

Generalizing goal-conditioned reinforcement learning with variational causal reasoning

W Ding, H Lin, B Li, D Zhao - Advances in Neural …, 2022 - proceedings.neurips.cc

As a pivotal component to attaining generalizable solutions in human intelligence,
reasoning provides great potential for reinforcement learning (RL) agents' generalization …

Cited by 37 - Related articles - All 7 versions

[PDF] psl.eu

[BOOK] Probabilistic graphical models: principles and techniques

D Koller, N Friedman - 2009 - books.google.com

A general framework for constructing and using probabilistic models of complex systems that
would enable a computer to use available information for making decisions. Most tasks …

Cited by 11134 - Related articles - All 13 versions

[PDF] academia.edu

[BOOK] Dynamic bayesian networks: representation, inference and learning

KP Murphy - 2002 - search.proquest.com

Modelling sequential data is important in many areas of science and engineering. Hidden
Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they …

Cited by 3967 - Related articles - All 18 versions

[PDF] jmlr.org

[PDF] An MDP-based recommender system.

G Shani, D Heckerman, RI Brafman… - Journal of machine …, 2005 - jmlr.org

Typical recommender systems adopt a static view of the recommendation process and treat
it as a prediction problem. We argue that it is more appropriate to view the problem of …

Cited by 1427 - Related articles - All 25 versions
