Active Measure Reinforcement Learning for Observation Cost Minimization.

S Holt, A Hüyük… - Advances in Neural …, 2024 - proceedings.neurips.cc

The control of continuous-time environments while actively deciding when to take costly
observations in time is a crucial yet unexplored problem, particularly relevant to real-world …

Cited by 5 · Related articles · All 5 versions

Learning long-term crop management strategies with cyclesgym

M Turchetta, L Corinzia, S Sussex… - Advances in neural …, 2022 - proceedings.neurips.cc

To improve the sustainability and resilience of modern food systems, designing improved
crop management strategies is crucial. The increasing abundance of data on agricultural …

Cited by 10 · Related articles · All 7 versions

Reinforcement learning with state observation costs in action-contingent noiselessly observable Markov decision processes

HJA Nam, S Fleming… - Advances in Neural …, 2021 - proceedings.neurips.cc

Many real-world problems that require making optimal sequences of decisions under
uncertainty involve costs when the agent wishes to obtain information about its environment …

Cited by 18 · Related articles · All 5 versions

Automated scientific discovery: from equation discovery to autonomous discovery systems

S Kramer, M Cerrato, S Džeroski, R King - arXiv preprint arXiv:2305.02251, 2023 - arxiv.org

The paper surveys automated scientific discovery, from equation discovery and symbolic
regression to autonomous discovery systems and agents. It discusses the individual …

Cited by 7 · Related articles · All 2 versions

Act-then-measure: reinforcement learning for partially observable environments with active measuring

M Krale, TD Simão, N Jansen - Proceedings of the International …, 2023 - ojs.aaai.org

We study Markov decision processes (MDPs), where agents control when and how they
gather information, as formalized by action-contingent noiselessly observable MDPs (ACNO …

Cited by 5 · Related articles · All 8 versions

Dynamic observation policies in observation cost-sensitive reinforcement learning

C Bellinger, M Crowley, I Tamblyn - arXiv preprint arXiv:2307.02620, 2023 - arxiv.org

Reinforcement learning (RL) has been shown to learn sophisticated control policies for
complex tasks including games, robotics, heating and cooling systems and text generation …

Cited by 3 · Related articles · All 4 versions

Mean-field games of speedy information access with observation costs

D Becherer, C Reisinger, J Tam - arXiv preprint arXiv:2309.07877, 2023 - arxiv.org

We investigate a mean-field game (MFG) in which agents can exercise control actions that
affect their speed of access to information. The agents can dynamically decide to receive …

Cited by 3 · Related articles · All 2 versions

Remote Estimation of Markov Processes over Costly Channels: On the Benefits of Implicit Information

ED Santi, T Soleymani, D Gunduz - arXiv preprint arXiv:2401.17999, 2024 - arxiv.org

In this paper, we study the remote estimation problem of a Markov process over a channel
with a cost. We formulate this problem as an infinite horizon optimization problem with two …

Cited by 2 · Related articles · All 2 versions

Monitored Markov Decision Processes

S Parisi, M Mohammedalamen, A Kazemipour… - arXiv preprint arXiv …, 2024 - arxiv.org

In reinforcement learning (RL), an agent learns to perform a task by interacting with an
environment and receiving feedback (a numerical reward) for its actions. However, the …

Cited by 1 · Related articles · All 5 versions

Time and temporal abstraction in continual learning: tradeoffs, analogies and regret in an active measuring setting

V Létourneau, C Bellinger… - … on Lifelong Learning …, 2023 - proceedings.mlr.press

This conceptual paper provides theoretical results linking notions in semi-supervised
learning (SSL) and hierarchical reinforcement learning (HRL) in the context of lifelong …
