H Sun, L Han, R Yang, X Ma… - Advances in neural …, 2022 - proceedings.neurips.cc
In this work, we study the simple yet universally applicable case of reward shaping in value- based Deep Reinforcement Learning (DRL). We show that reward shifting in the form of a …
Despite the success of Random Network Distillation (RND) in various domains, it was shown as not discriminative enough to be used as an uncertainty estimator for penalizing out-of …
Learning controllers with offline data in decision-making systems is an essential area of research due to its potential to reduce the risk of applications in real-world systems …
Recent years have witnessed significant advancements in offline reinforcement learning (RL), resulting in the development of numerous algorithms with varying degrees of …
R Yang, L Yong, X Ma, H Hu… - … on Machine Learning, 2023 - proceedings.mlr.press
Offline goal-conditioned RL (GCRL) offers a way to train general-purpose agents from fully offline datasets. In addition to being conservative within the dataset, the generalization …
We present a novel observation about the behavior of offline reinforcement learning (RL) algorithms: on many benchmark datasets, offline RL can produce well-performing and safe …
In this study, we aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective …
Offline reinforcement learning (RL) is a challenging setting where existing off-policy actor- critic methods perform poorly due to the overestimation of out-of-distribution state-action …
Uncertainty quantification (UQ) is essential for creating trustworthy machine learning models. Recent years have seen a steep rise in UQ methods that can flag suspicious …