Planning for sample efficient imitation learning

ZH Yin, W Ye, Q Chen, Y Gao - Advances in Neural …, 2022 - proceedings.neurips.cc
Imitation learning is a class of promising policy learning algorithms that is free from many
practical issues with reinforcement learning, such as the reward design issue and the …

Live in the moment: Learning dynamics model adapted to evolving policy

X Wang, W Wongkamjan, R Jia… - … on Machine Learning, 2023 - proceedings.mlr.press
Model-based reinforcement learning (RL) often achieves higher sample efficiency
in practice than model-free RL by learning a dynamics model to generate samples for policy …

CtrlFormer: Learning transferable state representation for visual control via transformer

Y Mu, S Chen, M Ding, J Chen, R Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformer has achieved great successes in learning vision and language representation,
which is general across various downstream tasks. In visual control, learning transferable …

Flow-based recurrent belief state learning for POMDPs

X Chen, YM Mu, P Luo, S Li… - … Conference on Machine …, 2022 - proceedings.mlr.press
Partially Observable Markov Decision Process (POMDP) provides a principled and
generic framework to model real world sequential decision making processes but yet …

Adaptation augmented model-based policy optimization

J Shen, H Lai, M Liu, H Zhao, Y Yu, W Zhang - Journal of Machine …, 2023 - jmlr.org
Compared to model-free reinforcement learning (RL), model-based RL is often more sample
efficient by leveraging a learned dynamics model to help decision making. However, the …

DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

Z Gao, Y Mu, J Qu, M Hu, L Guo, P Luo, Y Lu - arXiv preprint arXiv …, 2024 - arxiv.org
Dual-arm robots offer enhanced versatility and efficiency over single-arm counterparts by
enabling concurrent manipulation of multiple objects or cooperative execution of tasks using …

Do Agents Dream of Electric Sheep?: Improving Generalization in Reinforcement Learning through Generative Learning

G Franceschelli, M Musolesi - arXiv preprint arXiv:2403.07979, 2024 - arxiv.org
The Overfitted Brain hypothesis suggests dreams happen to allow generalization in the
human brain. Here, we ask if the same is true for reinforcement learning agents as well …

Improving model-based reinforcement learning through diverse state prediction

堀内優太, 白浜公章 - FY2023 Information Processing Society of Japan Kansai Branch Convention …, 2023 - ipsj.ixsq.nii.ac.jp
Improving model-based reinforcement learning through diverse state prediction. G-23, FY2023 Information Processing Society of Japan
Kansai Branch Convention. Improving Model-based …