S Levine, D Shah - … Transactions of the Royal Society B, 2023 - royalsocietypublishing.org
Navigation is one of the most heavily studied problems in robotics and is conventionally approached as a geometric mapping and planning problem. However, real-world navigation …
Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned …
In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and …
G Wang, S Cheng, X Zhan, X Li, S Song… - arXiv preprint arXiv …, 2023 - arxiv.org
Nowadays, open-source large language models like LLaMA have emerged. Recent developments have incorporated supervised fine-tuning (SFT) and reinforcement learning …
Effective offline RL methods require properly handling out-of-distribution actions. Implicit Q-learning (IQL) addresses this by training a Q-function using only dataset actions through a …
Reinforcement learning (RL) provides a theoretical framework for continuously improving an agent's behavior via trial and error. However, efficiently learning policies from scratch can be …
Guided sampling is a vital approach for applying diffusion models in real-world tasks that embeds human-defined guidance during the sampling procedure. This paper considers a …
Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the environment. We aim to tackle a more challenging problem: learning a safe policy from an …
T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional policy produces promising results. The Decision Transformer (DT) combines the conditional …