Counterfactual learning of continuous stochastic policies

H Wu, W Shi, A Choudhary, MD Wang - BMC Medical Informatics and …, 2024 - Springer

Learning policies for decision-making, such as recommending treatments in clinical settings,
is important for enhancing clinical decision-support systems. However, the challenge lies in …

Cited by: 3; Related articles; All 9 versions

[PDF] arxiv.org

Invariant policy learning: A causal perspective

S Saengkyongam, N Thams, J Peters… - IEEE transactions on …, 2023 - ieeexplore.ieee.org

Contextual bandit and reinforcement learning algorithms have been successfully used in
various interactive learning systems such as online advertising, recommender systems, and …

Cited by: 21; Related articles; All 10 versions

[PDF] researchgate.net

Off-policy learning over heterogeneous information for recommendation

X Wang, Q Li, D Yu, G Xu - Proceedings of the ACM Web Conference …, 2022 - dl.acm.org

Reinforcement learning has recently become an active topic in recommender system
research, where the logged data that records interactions between items and users …

Cited by: 9; Related articles; All 6 versions

[PDF] arxiv.org

Enhancing counterfactual classification via self-training

R Gao, M Biggs, W Sun, L Han - arXiv preprint arXiv:2112.04461, 2021 - arxiv.org

Unlike traditional supervised learning, in many settings only partial feedback is available.
We may only observe outcomes for the chosen actions, but not the counterfactual outcomes …

Cited by: 5; Related articles; All 2 versions

[PDF] arxiv.org

Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling

I Aouali, VE Brunel, D Rohde, A Korba - arXiv preprint arXiv:2406.03434, 2024 - arxiv.org

Off-policy learning (OPL) often involves minimizing a risk estimator based on importance
weighting to correct bias from the logging policy used to collect data. However, this method …

Cited by: 1; Related articles; All 9 versions
