Programmatic reinforcement learning without oracles

How to reuse and compose knowledge for a lifetime of tasks: A survey on continual learning and functional composition

JA Mendez, E Eaton - arXiv preprint arXiv:2207.07730, 2022 - arxiv.org

A major goal of artificial intelligence (AI) is to create an agent capable of acquiring a general
understanding of the world. Such an agent would require the ability to continually …

Cited by 22 · Related articles · All 3 versions

A survey on interpretable reinforcement learning

C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao… - Machine Learning, 2024 - Springer

Although deep reinforcement learning has become a promising machine learning approach
for sequential decision-making problems, it is still not mature enough for high-stake domains …

Cited by 67 · Related articles · All 3 versions

Artificial collective intelligence engineering: a survey of concepts and perspectives

R Casadei - Artificial Life, 2023 - ieeexplore.ieee.org

Collectiveness is an important property of many systems—both natural and artificial. By
exploiting a large number of individuals, it is often possible to produce effects that go far …

Cited by 16 · Related articles · All 10 versions

Efficient symbolic policy learning with differentiable symbolic expression

J Guo, R Zhang, S Peng, Q Yi, X Hu… - Advances in …, 2024 - proceedings.neurips.cc

Deep reinforcement learning (DRL) has led to a wide range of advances in sequential
decision-making tasks. However, the complexity of neural network policies makes it difficult …

Cited by 2 · Related articles · All 5 versions

Explainable reinforcement learning (XRL): a systematic literature review and taxonomy

Y Bekkemoen - Machine Learning, 2024 - Springer

In recent years, reinforcement learning (RL) systems have shown impressive performance
and remarkable achievements. Many achievements can be attributed to combining RL with …

Cited by 4 · Related articles · All 4 versions

π-light: Programmatic interpretable reinforcement learning for resource-limited traffic signal control

Y Gu, K Zhang, Q Liu, W Gao, L Li, J Zhou - Proceedings of the AAAI …, 2024 - ojs.aaai.org

The recent advancements in Deep Reinforcement Learning (DRL) have significantly
enhanced the performance of adaptive Traffic Signal Control (TSC). However, DRL policies …

Cited by 2 · Related articles

Show me the way! Bilevel search for synthesizing programmatic strategies

DS Aleixo, LHS Lelis - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org

The synthesis of programmatic strategies requires one to search in large non-differentiable
spaces of computer programs. Current search algorithms use self-play approaches to guide …

Cited by 8 · Related articles · All 4 versions

Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning

H Kohler, Q Delfosse, R Akrour, K Kersting… - arXiv preprint arXiv …, 2024 - arxiv.org

Deep reinforcement learning agents are prone to goal misalignments. The black-box nature
of their policies hinders the detection and correction of such misalignments, and the trust …

Cited by 2 · Related articles · All 3 versions

Verification-guided programmatic controller synthesis

Y Wang, H Zhu - International Conference on Tools and Algorithms for …, 2023 - Springer

We present a verification-based learning framework VEL that synthesizes safe programmatic
controllers for environments with continuous state and action spaces. The key idea is the …

Cited by 3 · Related articles · All 5 versions

Synthesizing programmatic policies with actor-critic algorithms and relu networks

S Orfanos, LHS Lelis - arXiv preprint arXiv:2308.02729, 2023 - arxiv.org

Programmatically Interpretable Reinforcement Learning (PIRL) encodes policies in human-readable
computer programs. Novel algorithms were recently introduced with the goal of …

Cited by 3 · Related articles · All 2 versions
