A survey on multi-task learning

Y Zhang, Q Yang - IEEE transactions on knowledge and data …, 2021 - ieeexplore.ieee.org
Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to
leverage useful information contained in multiple related tasks to help improve the …

Reinforcement learning for personalization: A systematic literature review

F Den Hengst, EM Grua, A el Hassouni… - Data …, 2020 - content.iospress.com
The major application areas of reinforcement learning (RL) have traditionally been game
playing and continuous control. In recent years, however, RL has been increasingly applied …

Adaptive methods for real-world domain generalization

A Dubey, V Ramanathan… - Proceedings of the …, 2021 - openaccess.thecvf.com
Invariant approaches have been remarkably successful in tackling the problem of domain
generalization, where the objective is to perform inference on data distributions different …

Meta-Thompson sampling

B Kveton, M Konobeev, M Zaheer… - International …, 2021 - proceedings.mlr.press
Efficient exploration in bandits is a fundamental online learning problem. We propose a
variant of Thompson sampling that learns to explore better as it interacts with bandit …

Designing reinforcement learning algorithms for digital interventions: pre-implementation guidelines

AL Trella, KW Zhang, I Nahum-Shani, V Shetty… - Algorithms, 2022 - mdpi.com
Online reinforcement learning (RL) algorithms are increasingly used to personalize digital
interventions in the fields of mobile health and online education. Common challenges in …

Hierarchical Bayesian bandits

J Hong, B Kveton, M Zaheer… - International …, 2022 - proceedings.mlr.press
Meta-, multi-task, and federated learning can be all viewed as solving similar tasks,
drawn from a distribution that reflects task similarities. We provide a unified view of all these …

Multi-armed bandit experimental design: Online decision-making and adaptive inference

D Simchi-Levi, C Wang - International Conference on …, 2023 - proceedings.mlr.press
Multi-armed bandit has been well-known for its efficiency in online decision-making in terms
of minimizing the loss of the participants' welfare during experiments (i.e., the regret). In …

No regrets for learning the prior in bandits

S Basu, B Kveton, M Zaheer… - Advances in neural …, 2021 - proceedings.neurips.cc
We propose AdaTS, a Thompson sampling algorithm that adapts sequentially to
bandit tasks that it interacts with. The key idea in AdaTS is to adapt to an unknown task prior …

Online clustering of bandits with misspecified user models

Z Wang, J Xie, X Liu, S Li, J Lui - Advances in Neural …, 2024 - proceedings.neurips.cc
The contextual linear bandit is an important online learning problem where given arm
features, a learning agent selects an arm at each round to maximize the cumulative rewards …

Meta-learning with stochastic linear bandits

L Cella, A Lazaric, M Pontil - International Conference on …, 2020 - proceedings.mlr.press
We investigate meta-learning procedures in the setting of stochastic linear bandits tasks.
The goal is to select a learning algorithm which works well on average over a class of …