H Bai, R Cheng, Y Jin - Intelligent Computing, 2023 - spj.science.org
Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments. The integration of RL with deep …
In reinforcement learning (RL), a reward function that aligns exactly with a task's true performance metric is often necessarily sparse. For example, a true task metric might …
H Li, X Yang, Z Wang, X Zhu, J Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
Many reinforcement learning environments (e.g., Minecraft) provide only sparse rewards that indicate task completion or failure with binary values. The challenge in exploration efficiency …
Preference-based reinforcement learning (RL) provides a framework to train agents using human preferences between two behaviors. However, preference-based RL has been …
We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs which compute the loss function for a value-based …
Reinforcement Learning and, more recently, Deep Reinforcement Learning are popular methods for solving sequential decision-making problems modeled as Markov Decision Processes …
M Hwang, G Lee, H Kee, CW Kim… - Advances in Neural …, 2024 - proceedings.neurips.cc
Reinforcement learning from human feedback (RLHF) alleviates the problem of designing a task-specific reward function in reinforcement learning by learning it from human preference …
K Sozykin, A Chertkov, R Schutski… - Advances in …, 2022 - proceedings.neurips.cc
We present a novel procedure for optimization based on the combination of efficient quantized tensor train representation and a generalized maximum matrix volume principle …
Vizier is the de facto black-box optimization service across Google, having optimized some of Google's largest products and research efforts. To operate at the scale of tuning thousands …