Basis function adaptation in temporal difference reinforcement learning

Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design

T Lookman, PV Balachandran, D Xue… - npj Computational …, 2019 - nature.com

One of the main challenges in materials discovery is efficiently exploring the vast search
space for targeted properties as approaches that rely on trial-and-error are impractical. We …

Cited by 399 - Related articles - All 7 versions

[PDF] utwente.nl

A tutorial on the cross-entropy method

PT De Boer, DP Kroese, S Mannor… - Annals of operations …, 2005 - Springer

The cross-entropy (CE) method is a new generic approach to combinatorial and multi-
extremal optimization and rare event simulation. The purpose of this tutorial is to give a …

Cited by 3322 - Related articles - All 22 versions

[PDF] acm.org

Learning scheduling algorithms for data processing clusters

H Mao, M Schwarzkopf, SB Venkatakrishnan… - Proceedings of the …, 2019 - dl.acm.org

Efficiently scheduling data processing jobs on distributed compute clusters requires complex
algorithms. Current systems use simple, generalized heuristics and ignore workload …

Cited by 717 - Related articles - All 13 versions

[PDF] cam.ac.uk

Resource management with deep reinforcement learning

H Mao, M Alizadeh, I Menache, S Kandula - Proceedings of the 15th …, 2016 - dl.acm.org

Resource management problems in systems and networking often manifest as difficult
online decision making tasks where appropriate solutions depend on understanding the …

Cited by 1358 - Related articles - All 12 versions

[PDF] tandfonline.com

[BOOK] Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions: by Warren B. Powell (ed.), Wiley (2022). Hardback. ISBN …

I Halperin - 2022 - Taylor & Francis

What is reinforcement learning? How is reinforcement learning different from stochastic
optimization? And finally, can it be used for applications to quantitative finance for my current …

Cited by 155 - Related articles - All 6 versions

[PDF] bookfusion.com

[BOOK] Algorithms for reinforcement learning

C Szepesvári - 2022 - books.google.com

Reinforcement learning is a learning paradigm concerned with learning to control a system
so as to maximize a numerical performance measure that expresses a long-term objective …

Cited by 2128 - Related articles - All 24 versions

[PDF] uliege.be

[BOOK] Reinforcement learning and dynamic programming using function approximators

L Busoniu, R Babuska, B De Schutter, D Ernst - 2017 - taylorfrancis.com

From household appliances to applications in robotics, engineered systems involving
complex dynamics can only be as effective as the algorithms that control them. While …

Cited by 1251 - Related articles - All 12 versions

[BOOK] The cross-entropy method: a unified approach to combinatorial optimization, Monte-Carlo simulation, and machine learning

RY Rubinstein, DP Kroese - 2004 - Springer

This book is a comprehensive and accessible introduction to the cross-entropy (CE) method.
The CE method started life around 1997 when the first author proposed an adaptive …

Cited by 2369 - Related articles - All 10 versions

[PDF] jmlr.org

[PDF] Policy evaluation with temporal differences: A survey and comparison

C Dann, G Neumann, J Peters - The Journal of Machine Learning …, 2014 - jmlr.org

Policy evaluation is an essential step in most reinforcement learning approaches. It yields a
value function, the quality assessment of states for a given policy, which can be used in a …

Cited by 286 - Related articles - All 21 versions

The cross-entropy method for optimization

ZI Botev, DP Kroese, RY Rubinstein, P L'Ecuyer - Handbook of statistics, 2013 - Elsevier

The cross-entropy method is a versatile heuristic tool for solving difficult estimation and
optimization problems, based on Kullback–Leibler (or cross-entropy) minimization. As an …

Cited by 290 - Related articles - All 4 versions
