Regret analysis of bandit problems with causal background knowledge

J Berrevoets, K Kacprzyk, Z Qian… - arXiv preprint arXiv …, 2023 - arxiv.org

Causality has the potential to truly transform the way we solve a large number of real-world
problems. Yet, so far, its potential largely remains to be unlocked as causality often requires …

Cited by 24 Related articles All 3 versions

[PDF] neurips.cc

Causal bandits with unknown graph structure

Y Lu, A Meisami, A Tewari - Advances in Neural …, 2021 - proceedings.neurips.cc

In causal bandit problems the action set consists of interventions on variables of a causal
graph. Several researchers have recently studied such bandit problems and pointed out …

Cited by 41 Related articles All 10 versions

[PDF] neurips.cc

Approximate allocation matching for structural causal bandits with unobserved confounders

L Wei, MQ Elahi, M Ghasemi… - Advances in Neural …, 2024 - proceedings.neurips.cc

Structural causal bandit provides a framework for online decision-making problems when
causal information is available. It models the stochastic environment with a structural causal …

Cited by 4 Related articles All 5 versions

[PDF] neurips.cc

Provably efficient causal reinforcement learning with confounded observational data

L Wang, Z Yang, Z Wang - Advances in Neural Information …, 2021 - proceedings.neurips.cc

Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous
empirical success. However, DRL requires a large dataset by interacting with the …

Cited by 56 Related articles All 6 versions

[PDF] mlr.press

Budgeted and non-budgeted causal bandits

V Nair, V Patil, G Sinha - International Conference on …, 2021 - proceedings.mlr.press

Learning good interventions in a causal graph can be modelled as a stochastic multi-armed
bandit problem with side-information. First, we study this problem when interventions are …

Cited by 40 Related articles All 4 versions

[PDF] neurips.cc

Rehearsal learning for avoiding undesired future

T Qin, TZ Wang, ZH Zhou - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Machine learning (ML) models have been widely used to make predictions. Instead
of a predictive statement about future outcomes, in many situations we want to pursue a …

Cited by 2 Related articles All 3 versions

[PDF] jmlr.org

Causal bandits for linear structural equation models

B Varici, K Shanmugam, P Sattigeri, A Tajer - Journal of Machine Learning …, 2023 - jmlr.org

This paper studies the problem of designing an optimal sequence of interventions in a
causal graphical model to minimize cumulative regret with respect to the best intervention in …

Cited by 12 Related articles All 6 versions

[PDF] mlr.press

Additive causal bandits with unknown graph

A Malek, V Aglietti, S Chiappa - International Conference on …, 2023 - proceedings.mlr.press

We explore algorithms to select actions in the causal bandit setting where the learner can
choose to intervene on a set of random variables related by a causal graph, and the learner …

Cited by 5 Related articles All 6 versions

[PDF] mlr.press

Efficient reinforcement learning with prior causal knowledge

Y Lu, A Meisami, A Tewari - Conference on Causal Learning …, 2022 - proceedings.mlr.press

We introduce causal Markov Decision Processes (C-MDPs), a new formalism for
sequential decision making which combines the standard MDP formulation with causal …

Cited by 23 Related articles All 5 versions

[PDF] aaai.org

Achieving counterfactual fairness for causal bandit

W Huang, L Zhang, X Wu - Proceedings of the AAAI conference on …, 2022 - ojs.aaai.org

In online recommendation, customers arrive in a sequential and stochastic manner from an
underlying distribution and the online decision model recommends a chosen item for each …

Cited by 22 Related articles All 6 versions
