A comparison of action selection methods for implicit policy method reinforcement learning...

D Bauso, J Gao, H Tembine - Proceedings of the 11th EAI International …, 2017 - dl.acm.org

In this paper we introduce the novel framework of distributionally robust games. These are
multi-player games where each player models the state of nature using a worst-case …

Cited by 29 · Related articles · All 5 versions

[PDF] mdpi.com

Improved Exploration in Reinforcement Learning Environments with Low-Discrepancy Action Selection

SW Carden, JO Lindborg, Z Utic - AppliedMath, 2022 - mdpi.com

Reinforcement learning (RL) is a subdomain of machine learning concerned with achieving
optimal behavior by interacting with an unknown and potentially stochastic environment. The …

Cited by 1 · Related articles · All 3 versions

[PDF] tum.de

Analysing Neuro-Dynamic Programming Through Non-Convex Optimisation

M Gottwald - 2024 - mediatum.ub.tum.de

Dynamic Programming and a Neural Network-based value-function approximation
approach have demonstrated superior performance in solving sequential decision making …

Least square reinforcement learning for solving inverted pendulum problem

SN Panyakaew, P Inkeaw, J Bootkrajang… - … on Computer and …, 2018 - ieeexplore.ieee.org

Inverted pendulum is one of the classic control problem that could be solved by
reinforcement learning approach. Most of the previous work consider the problem in discrete …

Cited by 2 · Related articles · All 2 versions

[PDF] researchgate.net

[PDF] Distributionally robust games, part I: f-divergence and learning

D Bauso, J Gao, H Tembine - arXiv preprint arXiv:1702.05371, 2017 - researchgate.net

In this paper we introduce the novel framework of distributionally robust games. These are
multi-player games where each player models the state of nature using a worst-case …

Cited by 2 · Related articles

[PDF] academia.edu

[PDF] Improved Exploration in Reinforcement Learning Environments with Low-Discrepancy Action Selection. AppliedMath 2022, 2, 234–246

SW Carden, JO Lindborg, Z Utic - 2022 - academia.edu

Reinforcement learning (RL) is a subdomain of machine learning concerned with achieving
optimal behavior by interacting with an unknown and potentially stochastic environment. The …

[PDF] mdx.ac.uk

A comparison of eligibility trace and momentum on SARSA in continuous state-and action-space

BD Nichols - 2017 9th Computer Science and Electronic …, 2017 - ieeexplore.ieee.org

Here the Newton's Method direct action selection approach to continuous action-space
reinforcement learning is extended to use an eligibility trace. This is then compared to the …

Cited by 2 · Related articles · All 5 versions

[PDF] cmu.ac.th

[PDF] Least Square Reinforcement Learning for Solving Cart-Pole Balancing Problem

SAN PANYAKAEW - 2019 - archive.lib.cmu.ac.th

Cart-pole balancing is a classic control problem that can be solved by reinforcement
learning approach. Most of previous work consider the problem in discrete state space …
