- Academic resource search

An information-theoretic perspective on intrinsic motivation in reinforcement learning: A survey

A Aubret, L Matignon, S Hassas - Entropy, 2023 - mdpi.com

The reinforcement learning (RL) research area is very active, with an important number of
new contributions, especially considering the emergent field of deep RL (DRL). However, a …

Cited by 47 · Related articles · All 10 versions

[PDF] mlr.press

Learning GFlowNets from partial episodes for improved convergence and stability

K Madan, J Rector-Brooks… - International …, 2023 - proceedings.mlr.press

Generative flow networks (GFlowNets) are a family of algorithms for training a sequential
sampler of discrete objects under an unnormalized target density and have been …

Cited by 71 · Related articles · All 7 versions

[PDF] jmlr.org

Convex reinforcement learning in finite trials

M Mutti, R De Santi, P De Bartolomeis… - Journal of Machine …, 2023 - jmlr.org

Convex Reinforcement Learning (RL) is a recently introduced framework that generalizes
the standard RL objective to any convex (or concave) function of the state distribution …

Cited by 14 · Related articles · All 5 versions

[PDF] arxiv.org

Thompson sampling for improved exploration in GFlowNets

J Rector-Brooks, K Madan, M Jain, M Korablyov… - arXiv preprint arXiv …, 2023 - arxiv.org

Generative flow networks (GFlowNets) are amortized variational inference algorithms that
treat sampling from a distribution over compositional objects as a sequential decision …

Cited by 20 · Related articles · All 3 versions

[PDF] smu.edu.sg

Curiosity-driven and victim-aware adversarial policies

C Gong, Z Yang, Y Bai, J Shi, A Sinha, B Xu… - Proceedings of the 38th …, 2022 - dl.acm.org

Recent years have witnessed great potential in applying Deep Reinforcement Learning
(DRL) in various challenging applications, such as autonomous driving, nuclear fusion …

Cited by 26 · Related articles · All 10 versions

[PDF] neurips.cc

MADE: Exploration via maximizing deviation from explored regions

T Zhang, P Rashidinejad, J Jiao… - Advances in …, 2021 - proceedings.neurips.cc

In online reinforcement learning (RL), efficient exploration remains particularly challenging
in high-dimensional environments with sparse rewards. In low-dimensional environments …

Cited by 49 · Related articles · All 7 versions

[PDF] mlr.press

Fast rates for maximum entropy exploration

D Tiapkin, D Belomestny… - International …, 2023 - proceedings.mlr.press

We address the challenge of exploration in reinforcement learning (RL) when the agent
operates in an unknown environment with sparse or no rewards. In this work, we study the …

Cited by 15 · Related articles · All 9 versions

[PDF] neurips.cc

Accelerating reinforcement learning with value-conditional state entropy exploration

D Kim, J Shin, P Abbeel, Y Seo - Advances in Neural …, 2024 - proceedings.neurips.cc

A promising technique for exploration is to maximize the entropy of visited state distribution,
i.e., state entropy, by encouraging uniform coverage of visited state space. While it has been …

Cited by 17 · Related articles · All 5 versions

[PDF] academia.edu

[PDF] Structure in reinforcement learning: A survey and open problems

A Mohan, A Zhang, M Lindauer - arXiv preprint arXiv:2306.16021, 2023 - academia.edu

Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …

Cited by 17 · Related articles · All 2 versions

[PDF] aaai.org

CEM: Constrained entropy maximization for task-agnostic safe exploration

Q Yang, MTJ Spaan - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org

In the absence of assigned tasks, a learning agent typically seeks to explore its environment
efficiently. However, the pursuit of exploration will bring more safety risks. An under-explored …

Cited by 17 · Related articles · All 7 versions
