What is essential for unseen goal generalization of offline goal-conditioned RL?

R Yang, L Yong, X Ma, H Hu… - … on Machine Learning, 2023 - proceedings.mlr.press
Offline goal-conditioned RL (GCRL) offers a way to train general-purpose agents from fully
offline datasets. In addition to being conservative within the dataset, the generalization …

Rewards-in-context: Multi-objective alignment of foundation models with dynamic preference adjustment

R Yang, X Pan, F Luo, S Qiu, H Zhong, D Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
We consider the problem of multi-objective alignment of foundation models with human
preferences, which is a critical step towards helpful and harmless AI systems. However, it is …

Imitating past successes can be very suboptimal

B Eysenbach, S Udatha… - Advances in Neural …, 2022 - proceedings.neurips.cc
Prior work has proposed a simple strategy for reinforcement learning (RL): label experience
with the outcomes achieved in that experience, and then imitate the relabeled experience …

Efficient bimanual handover and rearrangement via symmetry-aware actor-critic learning

Y Li, C Pan, H Xu, X Wang, Y Wu - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Bimanual manipulation is important for building intelligent robots that unlock richer skills
than single arms. We consider a multi-object bimanual rearrangement task, where a …

A connection between one-step RL and critic regularization in reinforcement learning

B Eysenbach, M Geist, S Levine… - International …, 2023 - proceedings.mlr.press
As with any machine learning problem with limited data, effective offline RL algorithms
require careful regularization to avoid overfitting. One class of methods, known as one-step …

GOPlan: Goal-conditioned offline reinforcement learning by planning with learned models

M Wang, R Yang, X Chen, H Sun, M Fang… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline Goal-Conditioned RL (GCRL) offers a feasible paradigm for learning general-
purpose policies from diverse and multi-task offline datasets. Despite notable recent …

ReRoGCRL: Representation-based robustness in goal-conditioned reinforcement learning

X Yin, S Wu, J Liu, M Fang, X Zhao, X Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
While Goal-Conditioned Reinforcement Learning (GCRL) has gained attention, its
algorithmic robustness, particularly against adversarial perturbations, remains unexplored …

Goal-conditioned Q-learning as knowledge distillation

A Levine, S Feizi - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
Many applications of reinforcement learning can be formalized as goal-conditioned
environments, where, in each episode, there is a "goal" that affects the rewards obtained …

Imaginary hindsight experience replay: Curious model-based learning for sparse reward tasks

R McCarthy, Q Wang, SJ Redmond - arXiv preprint arXiv:2110.02414, 2021 - arxiv.org
Model-based reinforcement learning is a promising learning strategy for practical robotic
applications due to its improved data-efficiency versus model-free counterparts. However …

Goal-conditioned offline reinforcement learning through state space partitioning

M Wang, Y Jin, G Montana - Machine Learning, 2024 - Springer
Offline reinforcement learning (RL) aims to create policies for sequential decision-making
using exclusively offline datasets. This presents a significant challenge, especially when …