Self-supervised representations for multi-view reinforcement learning

H Yang, D Shi, G Xie, Y Peng, Y Zhang… - The 38th Conference …, 2022 - openreview.net
H Yang, D Shi, G Xie, Y Peng, Y Zhang, Y Yang, S Yang
The 38th Conference on Uncertainty in Artificial Intelligence, 2022. openreview.net
Learning policies from raw pixel images is important for real-world applications of deep reinforcement learning (RL). Standard model-free RL algorithms focus on single-view settings and fold representation learning and policy learning into a single end-to-end training process. However, this paradigm is sample-inefficient and sensitive to hyperparameters when supervised only by the reward signal. To address this, we present Self-Supervised Representations (S2R) for multi-view reinforcement learning, a sample-efficient representation learning method for learning features from high-dimensional images. In S2R, we introduce a representation learning framework and define a novel multi-view auxiliary objective based on multi-view image states and the Conditional Entropy Bottleneck (CEB) principle. We integrate S2R with a deep RL agent to learn robust representations that preserve task-relevant information while discarding task-irrelevant information, and to find optimal policies that maximize the expected return. Empirically, we demonstrate the effectiveness of S2R on the visual DeepMind Control (DMControl) suite, showing improved performance both on the default DMControl tasks and on variants in which the default background is replaced with a random image or natural video.
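The abstract does not spell out the multi-view auxiliary objective, so the following is only a rough illustration of what a CEB-style loss between two views can look like, not the authors' formulation. It assumes unit-variance Gaussian encoders for both views and a contrastive (InfoNCE-style) stand-in for the term that keeps information shared across views; the function name `ceb_two_view_loss`, the argument names, and the `beta` weighting are all hypothetical.

```python
import numpy as np

def ceb_two_view_loss(mu_x, mu_y, beta=1.0, seed=0):
    """Illustrative CEB-style loss for a batch of paired views (assumed form).

    mu_x, mu_y: arrays of shape (K, D), the encoder means for view 1 and
    view 2 of the same K states. Both encoders are assumed to be
    unit-variance Gaussians, so normalising constants cancel.
    """
    rng = np.random.default_rng(seed)
    # Sample z ~ e(z | x) = N(mu_x, I) with the reparameterisation trick.
    z = mu_x + rng.standard_normal(mu_x.shape)

    # log b(z_i | y_j) up to a shared constant, for every pair (i, j).
    diff = z[:, None, :] - mu_y[None, :, :]
    log_b = -0.5 * np.sum(diff ** 2, axis=-1)          # shape (K, K)
    log_e = -0.5 * np.sum((z - mu_x) ** 2, axis=-1)    # shape (K,)

    # Compression term: log e(z|x) - log b(z|y), averaged over the batch.
    residual = np.mean(log_e - np.diag(log_b))

    # Contrastive term: log-softmax of the matching pair over the batch,
    # computed with a max-shifted log-sum-exp for numerical stability.
    m = log_b.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(log_b - m).sum(axis=1))
    nce = np.mean(np.diag(log_b) - lse)

    # Minimise the compression term while maximising the contrastive term.
    return beta * residual - nce, residual, nce
```

In this sketch, `residual` vanishes when the two views are encoded identically and grows as the encodings disagree, while `nce` is never positive and approaches zero when each sample is easily matched to its own second view. In an RL agent such a loss would typically be added to the policy and value losses as an auxiliary term, which is how the abstract describes S2R being used.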