J Wu, H Ma, C Deng, M Long - Advances in Neural …, 2024 - proceedings.neurips.cc
Unsupervised pre-training methods utilizing large and diverse datasets have achieved tremendous success across a range of domains. Recent work has investigated such …
State-of-the-art reinforcement learning (RL) algorithms typically use random sampling (e.g., $\epsilon$-greedy) for exploration, but this method fails on hard-exploration tasks like …
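For context, a minimal sketch of the $\epsilon$-greedy rule the snippet names: with probability $\epsilon$ act uniformly at random, otherwise act greedily with respect to the current value estimates. The Q-values and the $\epsilon=0.1$ setting below are illustrative assumptions, not taken from the paper.

import numpy as np

def epsilon_greedy(q_values: np.ndarray, epsilon: float = 0.1,
                   rng: np.random.Generator | None = None) -> int:
    """With probability epsilon pick a uniformly random action;
    otherwise pick the greedy (highest-estimated-value) action."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Usage: select an action for a state with 4 available actions.
action = epsilon_greedy(np.array([0.2, 0.8, 0.1, 0.5]), epsilon=0.1)

Because the random branch ignores the value estimates entirely, this rule explores by undirected chance, which is the failure mode on hard-exploration tasks that the snippet alludes to.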
A promising technique for exploration is to maximize the entropy of the visited state distribution, i.e., the state entropy, by encouraging uniform coverage of the visited state space. While it has been …
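For reference, the objective this snippet describes is the Shannon entropy of the policy's state-visitation distribution; the notation $d^\pi$ below is a common convention, not notation taken from the paper:

% State entropy: Shannon entropy of the state-visitation distribution d^\pi.
% It is maximized when d^\pi is uniform over the state space S, which is why
% maximizing it encourages uniform coverage of visited states.
H(d^\pi) = -\sum_{s \in \mathcal{S}} d^\pi(s) \log d^\pi(s)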
M Dunion, T McInroe, KS Luck… - Advances in Neural …, 2024 - proceedings.neurips.cc
Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature …
Satisfying a variety of conflicting needs in a changing environment is a fundamental challenge for any adaptive agent. Here, we show that designing an agent in a modular …
Current reinforcement learning algorithms struggle in sparse and complex environments, most notably in long-horizon manipulation tasks entailing a plethora of different sequences …
The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning …
Cooperative multi-agent reinforcement learning (MARL) requires agents to explore in order to learn to cooperate. Existing value-based MARL algorithms commonly rely on random exploration …
Offline pretraining with a static dataset followed by online fine-tuning (offline-to-online, or OtO) is a paradigm well matched to a real-world RL deployment process. In this scenario, we …
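A minimal sketch of the offline-to-online (OtO) loop the snippet describes: pretrain an agent on a fixed dataset, then fine-tune it with online environment interaction. The Agent stub, the transition format, and the batch size are illustrative assumptions, not the paper's method or API.

import random

class Agent:
    def update(self, batch):   # one training step on a batch (stub)
        pass
    def act(self, state):      # select an action (stub)
        return 0

def offline_to_online(agent, dataset, env, offline_steps, online_steps):
    # Phase 1 (offline): pretrain on the static dataset only.
    for _ in range(offline_steps):
        agent.update(random.sample(dataset, k=min(32, len(dataset))))
    # Phase 2 (online): fine-tune, appending fresh transitions to the buffer.
    state = env.reset()
    for _ in range(online_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        dataset.append((state, action, reward, next_state, done))
        agent.update(random.sample(dataset, k=min(32, len(dataset))))
        state = env.reset() if done else next_state

The two phases share one replay buffer here; how (and whether) to mix offline and online data during fine-tuning is exactly the kind of design question the OtO setting raises.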