Flow-based recurrent belief state learning for POMDPs

SE Li - 2023 - Springer

Since the beginning of the 21st century, artificial intelligence (AI) has been reshaping almost
all areas of human society, which has high potential to spark the fourth industrial revolution …

Cited by 86 Related articles All 4 versions

Latent state marginalization as a low-cost approach for improving exploration

D Zhang, A Courville, Y Bengio, Q Zheng… - arXiv preprint arXiv …, 2022 - arxiv.org

While the maximum entropy (MaxEnt) reinforcement learning (RL) framework--often touted
for its exploration and robustness capabilities--is usually motivated from a probabilistic …

Cited by 11 Related articles All 5 versions

Offline RL with discrete proxy representations for generalizability in POMDPs

P Gu, X Cai, D Xing, X Wang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Offline Reinforcement Learning (RL) has demonstrated promising results in various
applications by learning policies from previously collected datasets, reducing the need for …

Learning belief representations for partially observable deep RL

A Wang, AC Li, TQ Klassen, RT Icarte… - International …, 2023 - proceedings.mlr.press

Many important real-world Reinforcement Learning (RL) problems involve partial
observability and require policies with memory. Unfortunately, standard deep RL algorithms …

Cited by 2 Related articles All 6 versions

The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

R Avalos, F Delgrange, A Nowé, GA Pérez… - arXiv preprint arXiv …, 2023 - arxiv.org

Partially Observable Markov Decision Processes (POMDPs) are used to model
environments where the full state cannot be perceived by an agent. As such, the agent needs …

Cited by 3 Related articles All 9 versions

DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

Z Gao, Y Mu, J Qu, M Hu, L Guo, P Luo, Y Lu - arXiv preprint arXiv …, 2024 - arxiv.org

Dual-arm robots offer enhanced versatility and efficiency over single-arm counterparts by
enabling concurrent manipulation of multiple objects or cooperative execution of tasks using …

Cited by 1 Related articles
