PIRLNav: Pretraining with imitation and RL finetuning for ObjectNav

R Ramrakhya, D Batra, E Wijmans… - Proceedings of the …, 2023 - openaccess.thecvf.com
We study ObjectGoal Navigation, where a virtual robot situated in a new
environment is asked to navigate to an object. Prior work has shown that imitation learning …
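
A minimal sketch of the two-stage recipe the title names (imitation pretraining, then RL finetuning). The policy network, demonstration batches, and REINFORCE-style update below are illustrative placeholders, not the paper's architecture or training setup.

```python
# Sketch: behavior-cloning pretraining followed by RL finetuning.
# Network size, data format, and the RL update are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Policy(nn.Module):
    def __init__(self, obs_dim=64, n_actions=6):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.logits = nn.Linear(128, n_actions)

    def forward(self, obs):
        return self.logits(self.body(obs))

policy = Policy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Phase 1: behavior cloning on (observation, expert action) pairs.
def bc_step(obs, expert_actions):
    loss = F.cross_entropy(policy(obs), expert_actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Phase 2: on-policy RL finetuning (REINFORCE-style update as a stand-in).
def rl_step(obs, actions, returns):
    log_probs = F.log_softmax(policy(obs), dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = -(chosen * returns).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```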

Hybrid RL: Using both offline and online data can make RL efficient

Y Song, Y Zhou, A Sekhari, JA Bagnell… - arXiv preprint arXiv …, 2022 - arxiv.org
We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has
access to an offline dataset and the ability to collect experience via real-world online …
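
A small sketch of the hybrid setting described above: every update draws samples from both a fixed offline dataset and an online replay buffer. The 50/50 mixing ratio and the tabular Q-learning update are illustrative assumptions, not the paper's algorithm.

```python
# Sketch: each batch mixes offline transitions with online experience.
import random
from collections import defaultdict

class HybridBuffer:
    def __init__(self, offline_transitions):
        self.offline = list(offline_transitions)  # (s, a, r, s_next, done) tuples
        self.online = []

    def add(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        half = batch_size // 2
        batch = random.sample(self.offline, min(half, len(self.offline)))
        n_online = min(batch_size - len(batch), len(self.online))
        batch += random.sample(self.online, n_online)
        return batch

# Tabular Q table and a simple Q-learning update over the mixed batch.
Q = defaultdict(lambda: defaultdict(float))

def q_update(batch, lr=0.1, gamma=0.99):
    for s, a, r, s_next, done in batch:
        target = r if done else r + gamma * max(Q[s_next].values(), default=0.0)
        Q[s][a] += lr * (target - Q[s][a])
```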

Reincarnating reinforcement learning: Reusing prior computation to accelerate progress

R Agarwal, M Schwarzer, PS Castro… - Advances in neural …, 2022 - proceedings.neurips.cc
Learning tabula rasa, that is without any prior knowledge, is the prevalent workflow in
reinforcement learning (RL) research. However, RL systems, when applied to large-scale …
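
The snippet contrasts tabula rasa training with reusing prior computation. One common way to reuse an existing agent, shown as a hedged sketch below, is to add a distillation term toward a previously trained "teacher" alongside the usual TD loss; the loss weighting and the idea of decaying it are assumptions, not the paper's specific method.

```python
# Sketch: warm-start a new learner by distilling a previously trained teacher.
import torch
import torch.nn.functional as F

def reincarnation_loss(student_q, teacher_net, batch, distill_weight, gamma=0.99):
    obs, actions, rewards, next_obs, dones = batch
    q = student_q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1 - dones) * student_q(next_obs).max(dim=1).values
    td_loss = F.smooth_l1_loss(q, target)
    # Distill toward the teacher's action distribution; the weight would typically
    # be decayed so the student can eventually surpass the teacher.
    distill = F.kl_div(
        F.log_softmax(student_q(obs), dim=-1),
        F.softmax(teacher_net(obs), dim=-1),
        reduction="batchmean",
    )
    return td_loss + distill_weight * distill
```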

Watch and match: Supercharging imitation with regularized optimal transport

S Haldar, V Mathur, D Yarats… - Conference on Robot …, 2023 - proceedings.mlr.press
Imitation learning holds tremendous promise in learning policies efficiently for complex
decision making problems. Current state-of-the-art algorithms often use inverse …
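
The "regularized optimal transport" in the title suggests matching agent and expert trajectories via an OT coupling. The sketch below computes an entropic (Sinkhorn) coupling between agent and expert feature sequences and uses the negative transport cost as an imitation reward; the cosine cost and regularization strength are illustrative assumptions.

```python
# Sketch: optimal-transport imitation reward from a Sinkhorn coupling.
import numpy as np

def cosine_cost(x, y):
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    y = y / (np.linalg.norm(y, axis=1, keepdims=True) + 1e-8)
    return 1.0 - x @ y.T

def sinkhorn(cost, eps=0.05, n_iters=100):
    n, m = cost.shape
    K = np.exp(-cost / eps)
    u, v = np.ones(n) / n, np.ones(m) / m
    for _ in range(n_iters):
        u = (1.0 / n) / (K @ v + 1e-8)
        v = (1.0 / m) / (K.T @ u + 1e-8)
    return u[:, None] * K * v[None, :]  # transport plan with uniform marginals

def ot_rewards(agent_feats, expert_feats):
    cost = cosine_cost(agent_feats, expert_feats)
    plan = sinkhorn(cost)
    # Per-timestep reward: negative mass-weighted cost assigned to that agent step.
    return -(plan * cost).sum(axis=1)
```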

Offline multi-agent reinforcement learning with knowledge distillation

WC Tseng, THJ Wang, YC Lin… - Advances in Neural …, 2022 - proceedings.neurips.cc
We introduce an offline multi-agent reinforcement learning (offline MARL) framework that
utilizes previously collected data without additional online data collection. Our method …
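
A hedged sketch of one way to combine logged multi-agent data with distillation: a centralized teacher trained on joint observations, with each decentralized agent policy distilled from it using only the offline dataset. The architectures and KL loss are illustrative assumptions rather than the paper's exact method.

```python
# Sketch: distill a centralized teacher into per-agent policies on logged data.
import torch
import torch.nn.functional as F

def distill_step(teacher, students, optimizers, joint_obs, per_agent_obs):
    # teacher(joint_obs) is assumed to return one logits tensor per agent.
    with torch.no_grad():
        teacher_logits = teacher(joint_obs)
    total = 0.0
    for i, (student, opt) in enumerate(zip(students, optimizers)):
        student_logits = student(per_agent_obs[i])  # local observation only
        loss = F.kl_div(
            F.log_softmax(student_logits, dim=-1),
            F.softmax(teacher_logits[i], dim=-1),
            reduction="batchmean",
        )
        opt.zero_grad()
        loss.backward()
        opt.step()
        total += loss.item()
    return total / len(students)
```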

VRL3: A data-driven framework for visual deep reinforcement learning

C Wang, X Luo, K Ross, D Li - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose VRL3, a powerful data-driven framework with a simple design for solving
challenging visual deep reinforcement learning (DRL) tasks. We analyze a number of major …

Augmented behavioral annotation tools, with application to multimodal datasets and models: a systematic review

E Watson, T Viana, S Zhang - AI, 2023 - mdpi.com
Annotation tools are an essential component in the creation of datasets for machine learning
purposes. Annotation tools have evolved greatly since the turn of the century, and now …

Inverse reinforcement learning without reinforcement learning

G Swamy, D Wu, S Choudhury… - … on Machine Learning, 2023 - proceedings.mlr.press
Inverse Reinforcement Learning (IRL) is a powerful set of techniques for imitation
learning that aims to learn a reward function that rationalizes expert demonstrations …
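
The snippet defines IRL as learning a reward that rationalizes expert demonstrations. A generic way to express that objective, sketched below, alternates a reward update that scores expert states above the current policy's with a policy-improvement step; this illustrates the classical loop, not the specific reduction proposed in the paper, and the helper names are placeholders.

```python
# Sketch: generic IRL loop where the reward is trained to rationalize the expert.
import torch

def reward_update(reward_net, opt, expert_obs, policy_obs):
    # Push expert states up and policy-visited states down (max-margin style).
    loss = reward_net(policy_obs).mean() - reward_net(expert_obs).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def irl_loop(reward_net, reward_opt, policy, policy_improve, expert_obs, n_rounds=10):
    for _ in range(n_rounds):
        policy_obs = policy.rollout()          # states visited by the current policy
        reward_update(reward_net, reward_opt, expert_obs, policy_obs)
        policy_improve(policy, reward_net)     # e.g. a few RL steps on the learned reward
    return reward_net, policy
```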

Policy expansion for bridging offline-to-online reinforcement learning

H Zhang, W Xu, H Yu - arXiv preprint arXiv:2302.00935, 2023 - arxiv.org
Pre-training with offline data and online fine-tuning using reinforcement learning is a
promising strategy for learning control policies by leveraging the best of both worlds in terms …
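
A hedged sketch of the policy-expansion idea the title points to: the offline-pretrained policy is kept frozen, a fresh learnable policy is added, and the critic chooses which one acts at each step. The softmax-over-Q selection and the temperature are illustrative assumptions.

```python
# Sketch: frozen offline policy plus a new online policy; the critic arbitrates.
import torch
import torch.nn.functional as F

@torch.no_grad()
def expanded_action(offline_policy, online_policy, q_net, obs, temperature=1.0):
    a_off = offline_policy(obs)          # frozen, pretrained on offline data
    a_on = online_policy(obs)            # learnable, trained during online fine-tuning
    # q_net(obs, action) is assumed to return one value per batch element.
    q_values = torch.stack([q_net(obs, a_off), q_net(obs, a_on)], dim=-1)
    probs = F.softmax(q_values / temperature, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)   # pick which policy acts
    return torch.where(choice == 0, a_off, a_on)
```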

Dataset reset policy optimization for RLHF

JD Chang, W Shan, O Oertell, K Brantley… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) from Human Preference-based feedback is a popular
paradigm for fine-tuning generative models, which has produced impressive models such as …
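
A hedged sketch of the "dataset reset" idea suggested by the title: during preference-based fine-tuning, some rollouts are started from intermediate states or prefixes drawn from the offline dataset rather than always from the usual initial state. The reset probability and the generic update routine are illustrative assumptions, and the policy/reward-model methods are placeholders.

```python
# Sketch: mix dataset resets into the rollouts used for preference-based fine-tuning.
import random

def sample_start_state(initial_states, offline_prefixes, reset_prob=0.5):
    if offline_prefixes and random.random() < reset_prob:
        return random.choice(offline_prefixes)   # resume from a dataset prefix
    return random.choice(initial_states)         # standard start (e.g. a prompt)

def training_round(policy, reward_model, initial_states, offline_prefixes, n_rollouts=32):
    rollouts = []
    for _ in range(n_rollouts):
        start = sample_start_state(initial_states, offline_prefixes)
        traj = policy.rollout(start)              # generate from the chosen start
        rollouts.append((traj, reward_model.score(traj)))
    policy.update(rollouts)                       # any policy-gradient style update
    return rollouts
```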