Causal machine learning: A survey and open problems

J Kaddour, A Lynch, Q Liu, MJ Kusner… - arXiv preprint arXiv …, 2022 - arxiv.org
Causal Machine Learning (CausalML) is an umbrella term for machine learning methods
that formalize the data-generation process as a structural causal model (SCM). This …
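
As a concrete illustration of the SCM framing (a minimal sketch, not taken from the survey: the two-variable graph, the linear equations, the coefficient 2.0, and the Gaussian noise are all illustrative assumptions), the following shows what an intervention do(X = 1) means:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Exogenous noise terms, one per endogenous variable.
u_x = rng.normal(size=n)
u_y = rng.normal(size=n)

# Structural equations: each variable is a function of its
# parents plus its own noise term. Here the graph is X -> Y.
x = u_x
y = 2.0 * x + u_y          # illustrative causal effect of 2.0

# The intervention do(X = 1) replaces X's structural equation
# while keeping Y's equation (and u_y) fixed.
x_do = np.ones(n)
y_do = 2.0 * x_do + u_y

print(y.mean(), y_do.mean())   # E[Y] ~ 0 vs. E[Y | do(X=1)] ~ 2
```

Only X's equation is replaced under the intervention; with a confounder added between X and Y, the interventional mean E[Y | do(X=1)] and the observational conditional E[Y | X=1] would differ.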

Kernel instrumental variable regression

R Singh, M Sahani, A Gretton - Advances in Neural …, 2019 - proceedings.neurips.cc
Instrumental variable (IV) regression is a strategy for learning causal relationships in
observational data. If measurements of input X and output Y are confounded, the causal …
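
A minimal sketch of the two-stage structure behind kernel IV regression, collapsing the paper's two-sample scheme into one sample for brevity; the Gaussian kernel, its bandwidth, the regularizer `lam`, and the simulated confounded data are illustrative assumptions:

```python
import numpy as np

def rbf(a, b, s=1.0):
    # Gaussian kernel matrix between 1-D sample vectors a and b.
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * s**2))

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)                 # instrument
u = rng.normal(size=n)                 # unobserved confounder
x = z + u + 0.1 * rng.normal(size=n)   # treatment, confounded
y = np.sin(x) + u                      # outcome; true f(x) = sin(x)

lam = 1e-3
Kzz, Kxx = rbf(z, z), rbf(x, x)

# Stage 1: kernel ridge regression from Z to kernel features of X
# (an estimate of the conditional mean embedding of X given Z).
A = np.linalg.solve(Kzz + n * lam * np.eye(n), Kzz)

# Stage 2: ridge-regress Y on the stage-1 projected features of X.
W = Kxx @ A
alpha = np.linalg.solve(W @ W.T + n * lam * Kxx, W @ y)

x_test = np.linspace(-2.0, 2.0, 5)
print(np.c_[x_test, rbf(x_test, x) @ alpha, np.sin(x_test)])
```

Plain ridge regression of y on x would absorb the confounder u into the fit; the instrument z carries variation in x that is independent of u, which is what the two stages exploit.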

RL for latent MDPs: Regret guarantees and a lower bound

J Kwon, Y Efroni, C Caramanis… - Advances in Neural …, 2021 - proceedings.neurips.cc
In this work, we consider the regret minimization problem for reinforcement learning in latent
Markov Decision Processes (LMDPs). In an LMDP, an MDP is randomly drawn from a set of …
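
As a worked definition (the notation below is illustrative, not copied from the paper):

```latex
% In an LMDP, nature draws an MDP M_m from a finite set
% {M_1, ..., M_L} with mixing weights w at the start of each
% episode; the agent never observes the index m. Regret over K
% episodes compares against the best single policy for the mixture:
\[
\mathrm{Regret}(K)
  = \sum_{k=1}^{K}
    \Bigl( \max_{\pi}\, \mathbb{E}_{m \sim w}\bigl[ V^{\pi}_{M_m} \bigr]
           - \mathbb{E}_{m \sim w}\bigl[ V^{\pi_k}_{M_m} \bigr] \Bigr).
\]
```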

Provably efficient reinforcement learning in partially observable dynamical systems

M Uehara, A Sekhari, JD Lee… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study Reinforcement Learning for partially observable systems using function
approximation. We propose a new PO-bilinear framework that is general enough to include …
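
Schematically, the bilinear structure looks as follows; the embeddings X_h, W_h and the loss ℓ_h are stand-ins for the paper's notation, so treat this as a sketch of the shape of the assumption rather than its exact statement:

```latex
% The average Bellman-style error of a candidate f, evaluated under
% the roll-in policy of another candidate f', factorizes at step h:
\[
\Bigl|\, \mathbb{E}_{\pi_{f'}}\!\bigl[ \ell_h(f; \tau_h) \bigr] \Bigr|
  = \bigl|\, \langle X_h(f'),\, W_h(f) \rangle \,\bigr|,
\]
% i.e. the error is bilinear in two finite-dimensional embeddings,
% which is the structure a sample-efficient learner can exploit.
```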

Optimistic MLE: A generic model-based algorithm for partially observable sequential decision making

Q Liu, P Netrapalli, C Szepesvari, C Jin - Proceedings of the 55th …, 2023 - dl.acm.org
This paper introduces a simple and efficient learning algorithm for general sequential
decision making. The algorithm combines Optimism for exploration with Maximum Likelihood …
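
A hedged sketch of the optimism-plus-MLE loop on a toy two-armed Bernoulli bandit; the finite model grid, the confidence radius `beta`, and the bandit setting itself are illustrative simplifications of the general partially observable setting the paper treats:

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.4, 0.7])        # unknown to the learner

grid = np.linspace(0.05, 0.95, 10)
models = np.array([(p, q) for p in grid for q in grid])  # model class
loglik = np.zeros(len(models))
beta = 3.0                               # log-likelihood slack
counts = np.zeros(2)

for t in range(2000):
    # Confidence set: all models whose likelihood is near the MLE's.
    alive = loglik >= loglik.max() - beta
    # Optimism: pick the surviving model promising the highest value,
    # then act optimally as if that model were true.
    cand = models[alive]
    best = cand[cand.max(axis=1).argmax()]
    a = int(best.argmax())
    r = float(rng.random() < true_means[a])
    # MLE update: fold this observation into every model's likelihood.
    p = models[:, a]
    loglik += np.log(np.where(r == 1.0, p, 1.0 - p))
    counts[a] += 1

print("pulls per arm:", counts)          # should favor arm 1
```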

PAC reinforcement learning for predictive state representations

W Zhan, M Uehara, W Sun, JD Lee - arXiv preprint arXiv:2207.05738, 2022 - arxiv.org
In this paper we study online Reinforcement Learning (RL) in partially observable dynamical
systems. We focus on the Predictive State Representations (PSRs) model, which is an …
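
For context, the core object in a (linear) PSR, in standard notation rather than anything specific to this paper:

```latex
% The predictive state after history h is the vector of success
% probabilities of a core set of tests t_1, ..., t_d (future
% action-observation sequences); in a linear PSR, every further
% test t is a fixed linear functional of that state:
\[
q(h) = \bigl( \mathbb{P}(t_1 \mid h), \dots, \mathbb{P}(t_d \mid h) \bigr)^{\top},
\qquad
\mathbb{P}(t \mid h) = m_t^{\top} q(h).
\]
```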

A minimax learning approach to off-policy evaluation in confounded partially observable Markov decision processes

C Shi, M Uehara, J Huang… - … Conference on Machine …, 2022 - proceedings.mlr.press
We consider off-policy evaluation (OPE) in Partially Observable Markov Decision Processes
(POMDPs), where the evaluation policy depends only on observable variables and the …
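
Schematically, minimax estimation of a bridge function from a conditional moment restriction has the following shape; the moment m, the test class F, and the proxy/instrument Z are generic placeholders, not the paper's exact estimand:

```latex
% A bridge function b is pinned down by E[ m(O; b) | Z ] = 0 and
% estimated as a saddle point over a class F of test functions:
\[
\hat b = \arg\min_{b \in \mathcal{B}} \, \max_{f \in \mathcal{F}}
  \; \mathbb{E}_n\!\bigl[ m(O; b)\, f(Z) \bigr]
  \;-\; \tfrac{1}{2}\, \mathbb{E}_n\!\bigl[ f(Z)^2 \bigr].
\]
```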

GEC: A unified framework for interactive decision making in MDP, POMDP, and beyond

H Zhong, W Xiong, S Zheng, L Wang, Z Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
We study sample-efficient reinforcement learning (RL) under the general framework of
interactive decision making, which includes Markov decision processes (MDPs), partially …

Future-dependent value-based off-policy evaluation in POMDPs

M Uehara, H Kiyohara, A Bennett… - Advances in …, 2024 - proceedings.neurips.cc
We study off-policy evaluation (OPE) for partially observable MDPs (POMDPs) with general
function approximation. Existing methods such as sequential importance sampling …

Causal imitation learning under temporally correlated noise

G Swamy, S Choudhury, D Bagnell… - … on Machine Learning, 2022 - proceedings.mlr.press
We develop algorithms for imitation learning from policy data corrupted by temporally
correlated noise in expert actions. When noise affects multiple timesteps of …
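
A minimal sketch of the de-confounding idea (not the paper's algorithms): when the noise e_t corrupts two consecutive actions, the current state is correlated with its own action noise through the previous action, so naive regression of actions on states is biased, while an earlier state is a valid instrument. Linear dynamics, a linear expert policy, MA(1) noise, and all constants are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000
theta = 0.8                        # true expert policy: a = theta * s

e = rng.normal(size=T)             # noise shared across two timesteps
s = np.zeros(T)
a = np.zeros(T)
for t in range(1, T - 1):
    a[t] = theta * s[t] + e[t] + 0.9 * e[t - 1]  # MA(1) action noise
    s[t + 1] = 0.5 * s[t] + 0.5 * a[t]           # noise leaks into state

st, at, z = s[2:-1], a[2:-1], s[1:-2]  # instrument: the previous state

ols = (st @ at) / (st @ st)        # naive behavioral cloning: biased
iv = (z @ at) / (z @ st)           # instrumented regression: consistent
print(f"OLS {ols:.3f}   IV {iv:.3f}   true theta {theta}")
```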