LightZero: A unified benchmark for Monte Carlo tree search in general sequential decision scenarios

Y Niu, Y Pu, Z Yang, X Li, T Zhou… - Advances in …, 2024 - proceedings.neurips.cc
Building agents based on tree-search planning capabilities with learned models has
achieved remarkable success in classic decision-making problems, such as Go and Atari …

Reinforcement learning with knowledge representation and reasoning: A brief survey

C Yu, X Zheng, HH Zhuo, H Wan, W Luo - arXiv preprint arXiv:2304.12090, 2023 - arxiv.org
Reinforcement Learning (RL) has achieved tremendous development in recent years, but
still faces significant obstacles in addressing complex real-life problems due to the issues of …

A generalist dynamics model for control

I Schubert, J Zhang, J Bruce, S Bechtle… - arXiv preprint arXiv …, 2023 - arxiv.org
We investigate the use of transformer sequence models as dynamics models (TDMs) for
control. We find that TDMs exhibit strong generalization capabilities to unseen …

Towards biologically plausible model-based reinforcement learning in recurrent spiking networks by dreaming new experiences

C Capone, PS Paolucci - Scientific Reports, 2024 - nature.com
Humans and animals can learn new skills after practicing for a few hours, while current
reinforcement learning algorithms require a large amount of data to achieve good …

Efficient Imitation Learning with Conservative World Models

V Kolev, R Rafailov, K Hatch, J Wu, C Finn - arXiv preprint arXiv …, 2024 - arxiv.org
We tackle the problem of policy learning from expert demonstrations without a reward
function. A central challenge in this space is that these policies fail upon deployment due to …

Towards high efficient long-horizon planning with expert-guided motion-encoding tree search

T Zhou, E Lyu, G Cen, Z Zha, S Qi… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Autonomous driving holds promise for increased safety, optimized traffic management, and
a new level of convenience in transportation. While model-based reinforcement learning …

Addressing implicit bias in adversarial imitation learning with mutual information

L Zhang, Q Liu, F Zhu, Z Huang - Neural Networks, 2023 - Elsevier
Adversarial imitation learning (AIL) is a powerful method for automated decision systems
due to training a policy efficiently by mimicking expert demonstrations. However, implicit bias …

An Efficient Node Selection Policy for Monte Carlo Tree Search with Neural Networks

X Liu, Y Peng, G Zhang, R Zhou - INFORMS Journal on …, 2024 - pubsonline.informs.org
Monte Carlo tree search (MCTS) has been gaining increasing popularity, and the success of
AlphaGo has prompted a new trend of incorporating a value network and a policy network …

Boosting Reinforcement Learning and Planning with Demonstrations: A Survey

T Mu, H Su - arXiv preprint arXiv:2303.13489, 2023 - arxiv.org
Although reinforcement learning has seen tremendous success recently, this kind of
trial-and-error learning can be impractical or inefficient in complex environments. The use of …

Force-Based Robotic Imitation Learning: A Two-Phase Approach for Construction Assembly Tasks

H You, Y Ye, T Zhou, J Du - arXiv preprint arXiv:2501.14942, 2025 - arxiv.org
The drive for efficiency and safety in construction has boosted the role of robotics and
automation. However, complex tasks like welding and pipe insertion pose challenges due to …