Uncertainty-aware policy optimization: A robust, adaptive trust region approach

Combining imitation and deep reinforcement learning to human-level performance on a virtual foraging task

V Giammarino, MF Dunne, KN Moore… - Adaptive …, 2024 - journals.sagepub.com

We develop a framework to learn bio-inspired foraging policies using human data. We
conduct an experiment where humans are virtually immersed in an open field foraging …

Cited by 3

Trust region policy optimization via entropy regularization for Kullback–Leibler divergence constraint

H Xu, J Xuan, G Zhang, J Lu - Neurocomputing, 2024 - Elsevier

Trust region policy optimization (TRPO) is one of the landmark policy optimization algorithms
in deep reinforcement learning. Its purpose is to maximize a surrogate objective based on …

Cited by 3

Generalized policy improvement algorithms with theoretically supported sample reuse

J Queeney, IC Paschalidis… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

We develop a new class of model-free deep reinforcement learning algorithms for data-
driven, learning-based control. Our Generalized Policy Improvement algorithms combine the …

Cited by 4

Dynamic Environment-driven Autonomous Drone Path Planning via Deep Reinforcement Learning

Q Wang, J Gu - 2024 International Joint Conference on Neural …, 2024 - ieeexplore.ieee.org

Path planning is a key enabling technology for drone-based applications, where the drone
may encounter situations that require impromptu decisions to avoid task failure. Traditional …

Combining imitation and deep reinforcement learning to accomplish human-level performance on a virtual foraging task

V Giammarino, MF Dunne, KN Moore… - arXiv preprint arXiv …, 2022 - arxiv.org

We develop a simple framework to learn bio-inspired foraging policies using human data.
We conduct an experiment where humans are virtually immersed in an open field foraging …

Cited by 3

DFF: Decision-Focused Fine-tuning for Smarter Predict-then-Optimize with Limited Data

J Yang, E Liang, Z Su, Z Zou, P Zhen, J Guo… - arXiv preprint arXiv …, 2025 - arxiv.org

Decision-focused learning (DFL) offers an end-to-end approach to the predict-then-optimize
(PO) framework by training predictive models directly on decision loss (DL), enhancing …

Reliable deep reinforcement learning: stable training and robust deployment

J Queeney - 2023 - search.proquest.com

Deep reinforcement learning (RL) represents a data-driven framework for sequential
decision making that has demonstrated the ability to solve challenging control tasks. This …

On the use of expert data to imitate behavior and accelerate Reinforcement Learning

V Giammarino - 2024 - search.proquest.com

This dissertation examines the integration of expert datasets to enhance the data efficiency
of online Deep Reinforcement Learning (DRL) algorithms in large state and action space …

Learning from humans: Combining imitation and deep reinforcement learning to accomplish human-level performance on a virtual foraging task

V Giammarino, MF Dunne, KN Moore… - arXiv preprint arXiv …, 2022 - academia.edu

We develop a method to learn bio-inspired foraging policies using human data. We conduct
an experiment where humans are virtually immersed in an open field foraging environment …

Cited by 1

Model-Based Reinforcement Learning under Sparse Rewards

R Akash - raviakash.github.io

Reinforcement Learning (RL) has recently seen significant advances over the last decade in
simulated and controlled environments. RL has shown impressive results in difficult decision …
