Reinforcement learning as classification: Leveraging modern classifiers

D Silver, J Schrittwieser, K Simonyan, I Antonoglou… - Nature, 2017 - nature.com

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa,
superhuman proficiency in challenging domains. Recently, AlphaGo became the first …

Cited by 11890 · Related articles · All 42 versions

[PDF] tamu.edu

[PDF] Trust Region Policy Optimization

J Schulman - arXiv preprint arXiv:1502.05477, 2015 - people.engr.tamu.edu

In this article, we describe a method for optimizing control policies, with guaranteed
monotonic improvement. By making several approximations to the theoretically-justified …

Cited by 9203 · Related articles

[PDF] arxiv.org

Deep reinforcement learning in large discrete action spaces

G Dulac-Arnold, R Evans, H van Hasselt… - arXiv preprint arXiv …, 2015 - arxiv.org

Being able to reason in an environment with a large number of discrete actions is essential
to bringing reinforcement learning to a larger class of problems. Recommender systems …

Cited by 787 · Related articles · All 6 versions

[PDF] neurips.cc

Learn what not to learn: Action elimination with deep reinforcement learning

T Zahavy, M Haroush, N Merlis… - Advances in neural …, 2018 - proceedings.neurips.cc

Learning how to act when there are many available actions in each state is a challenging
task for Reinforcement Learning (RL) agents, especially when many of the actions are …

Cited by 245 · Related articles · All 10 versions

[PDF] uva.es

Adversarial environment reinforcement learning algorithm for intrusion detection

G Caminero, M Lopez-Martin, B Carro - Computer Networks, 2019 - Elsevier

Intrusion detection is a crucial service in today's data networks, and the search for new fast
and robust algorithms that are capable of detecting and classifying dangerous traffic is …

Cited by 219 · Related articles · All 4 versions

[PDF] uliege.be

[BOOK] Reinforcement learning and dynamic programming using function approximators

L Busoniu, R Babuska, B De Schutter, D Ernst - 2017 - taylorfrancis.com

From household appliances to applications in robotics, engineered systems involving
complex dynamics can only be as effective as the algorithms that control them. While …

Cited by 1303 · Related articles · All 12 versions

[PDF] jmlr.org

[PDF] Tree-based batch mode reinforcement learning

D Ernst, P Geurts, L Wehenkel - Journal of Machine Learning Research, 2005 - jmlr.org

Reinforcement learning aims to determine an optimal control policy from interaction with a
system or from observations gathered from a system. In batch mode, it can be achieved by …

Cited by 1527 · Related articles · All 21 versions

[PDF] uliege.be

Approximate reinforcement learning: An overview

L Buşoniu, D Ernst, B De Schutter… - 2011 IEEE symposium …, 2011 - ieeexplore.ieee.org

Reinforcement learning (RL) allows agents to learn how to optimally interact with complex
environments. Fueled by recent advances in approximation-based algorithms, RL has …

Cited by 89 · Related articles · All 15 versions

[PDF] arxiv.org

On the role of planning in model-based deep reinforcement learning

JB Hamrick, AL Friesen, F Behbahani, A Guez… - arXiv preprint arXiv …, 2020 - arxiv.org

Model-based planning is often thought to be necessary for deep, careful reasoning and
generalization in artificial agents. While recent successes of model-based reinforcement …

Cited by 89 · Related articles · All 3 versions

[HTML] ieee-jas.net

Multiagent reinforcement learning: Rollout and policy iteration

D Bertsekas - IEEE/CAA Journal of Automatica Sinica, 2021 - ieeexplore.ieee.org

We discuss the solution of complex multistage decision problems using methods that are
based on the idea of policy iteration (PI), i.e., start from some base policy and generate an …

Cited by 114 · Related articles · All 9 versions
