Efficient kernelized UCB for contextual bandits

S Vakili, J Scarlett, D Shiu… - … on Machine Learning, 2022 - proceedings.mlr.press

Kernel-based models such as kernel ridge regression and Gaussian processes are
ubiquitous in machine learning applications for regression and optimization. It is well known …

Cited by 26 · Related articles · All 5 versions

[PDF] neurips.cc

Communication efficient distributed learning for kernelized contextual bandits

C Li, H Wang, M Wang, H Wang - Advances in Neural …, 2022 - proceedings.neurips.cc

We tackle the communication efficiency challenge of learning kernelized contextual bandits
in a distributed setting. Despite the recent advances in communication-efficient distributed …

Cited by 17 · Related articles · All 7 versions

[PDF] mlr.press

Tight regret and complexity bounds for Thompson sampling via Langevin Monte Carlo

T Huix, M Zhang, A Durmus - International Conference on …, 2023 - proceedings.mlr.press

In this paper, we consider high dimensional contextual bandit problems. Within this setting,
Thompson Sampling and its variants have been proposed and have been successfully …

Cited by 7 · Related articles · All 3 versions

Interactive preference analysis: A reinforcement learning framework

X Hu, S Kang, L Ren, S Zhu - European Journal of Operational Research, 2024 - Elsevier

Automated investment managers are increasingly popular in personal wealth management
due to their cost effectiveness, objectivity, and accessibility. However, it still suffers from …

Cited by 1 · Related articles · All 3 versions

[PDF] mlr.press

Sequential counterfactual risk minimization

H Zenati, E Diemert, M Martin… - International …, 2023 - proceedings.mlr.press

Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged
bandit feedback problem, where the goal is to improve a logging policy using offline data. In …

Cited by 4 · Related articles · All 12 versions

[PDF] nsf.gov

Learning kernelized contextual bandits in a distributed and asynchronous environment

C Li, H Wang, M Wang, H Wang - International Conference on Learning …, 2023 - par.nsf.gov

Despite the recent advances in communication-efficient distributed bandit learning, most
existing solutions are restricted to parametric models, e.g., linear bandits and generalized …

Cited by 5 · Related articles · All 2 versions

[PDF] mlr.press

Adversarial Contextual Bandits Go Kernelized

G Neu, J Olkhovskaya, S Vakili - … Conference on Algorithmic …, 2024 - proceedings.mlr.press

We study a generalization of the problem of online learning in adversarial linear contextual
bandits by incorporating loss functions that belong to a reproducing kernel Hilbert space …

Dual instrumental method for confounded kernelized bandits

X Gong, J Zhang - arXiv preprint arXiv:2209.03224, 2022 - arxiv.org

The contextual bandit problem is a theoretically justified framework with wide applications in
various fields. While the previous study on this problem usually requires independence …

Cited by 3 · Related articles · All 3 versions

[PDF] arxiv.org

Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits

HM Bui, E Mallada, A Liu - arXiv preprint arXiv:2411.05979, 2024 - arxiv.org

By leveraging the representation power of deep neural networks, neural upper confidence
bound (UCB) algorithms have shown success in contextual bandits. To further balance the …

Adversarial Contextual Bandits Go Kernelized

G Neu, J Olkhovskaya, S Vakili - arXiv preprint arXiv:2310.01609, 2023 - arxiv.org

We study a generalization of the problem of online learning in adversarial linear contextual
bandits by incorporating loss functions that belong to a reproducing kernel Hilbert space …
