Reinforcement learning can be more efficient with multiple rewards

C Dann, Y Mansour, M Mohri - International Conference on …, 2023 - proceedings.mlr.press
Reward design is one of the most critical and challenging aspects of formulating a task
as a reinforcement learning (RL) problem. In practice, it often takes several attempts of …

Multi-task representation learning for pure exploration in linear bandits

Y Du, L Huang, W Sun - International Conference on …, 2023 - proceedings.mlr.press
Despite the recent success of representation learning in sequential decision making, the
study of the pure exploration scenario (i.e., identify the best option and minimize the sample …

Can Q-learning be improved with advice?

N Golowich, A Moitra - Conference on Learning Theory, 2022 - proceedings.mlr.press
Despite rapid progress in theoretical reinforcement learning (RL) over the last few years,
most of the known guarantees are worst-case in nature, failing to take advantage of structure …

Horizon-free and variance-dependent reinforcement learning for latent Markov decision processes

R Zhou, R Wang, SS Du - International Conference on …, 2023 - proceedings.mlr.press
We study regret minimization for reinforcement learning (RL) in Latent Markov Decision
Processes (LMDPs) with context in hindsight. We design a novel model-based algorithmic …

On the power of pre-training for generalization in RL: provable benefits and hardness

H Ye, X Chen, L Wang, SS Du - International Conference on …, 2023 - proceedings.mlr.press
Generalization in Reinforcement Learning (RL) aims to train an agent that
generalizes to the target environment. In this work, we first point out that RL …

Provably efficient offline reinforcement learning with perturbed data sources

C Shi, W Xiong, C Shen, J Yang - … Conference on Machine …, 2023 - proceedings.mlr.press
Existing theoretical studies on offline reinforcement learning (RL) mostly consider a dataset
sampled directly from the target task. In practice, however, data often come from several …

Thompson sampling for robust transfer in multi-task bandits

Z Wang, C Zhang, K Chaudhuri - arXiv preprint arXiv:2206.08556, 2022 - arxiv.org
We study the problem of online multi-task learning where the tasks are performed within
similar but not necessarily identical multi-armed bandit environments. In particular, we study …

Multitask transfer learning with kernel representation

Y Zhang, S Ying, Z Wen - Neural Computing and Applications, 2022 - Springer
In many real-world applications, collecting and labeling data is expensive and time-
consuming. Thus, there is a need to obtain a high-performance learner by leveraging the …

Efficient multi-task reinforcement learning via selective behavior sharing

G Zhang, A Jain, I Hwang, SH Sun, JJ Lim - arXiv preprint arXiv …, 2023 - arxiv.org
The ability to leverage shared behaviors between tasks is critical for sample-efficient multi-
task reinforcement learning (MTRL). While prior methods have primarily explored parameter …

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

Z Xu, Z Xu, R Jiang, P Stone, A Tewari - arXiv preprint arXiv:2403.01636, 2024 - arxiv.org
Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for
their wide applications in many important Reinforcement Learning (RL) tasks. However, while …