Reinforcement learning with non-cumulative objective

5 results (0.02 seconds)

A survey on scheduling techniques in computing and network convergence

S Tang, Y Yu, H Wang, G Wang, W Chen… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

The computing demand for massive applications has led to the ubiquitous deployment of
computing power. This trend results in the urgent need for higher-level computing resource …

Cited by: 10 · All 2 versions

DHAA: Distributed heuristic action aware multi-agent path finding in high density scene

D Zhou, Z Pang, W Li - Multimedia Tools and Applications, 2024 - Springer

Multi-agent path finding (MAPF) in highly structured environments is an exciting and
complex problem. Compared with lower-density environments, the problems of agent credit …

[PDF] arxiv.org

Designing, Developing, and Validating Network Intelligence for Scaling in Service-Based Architectures based on Deep Reinforcement Learning

P Soto, M Camelo, D De Vleeschauwer… - arXiv preprint arXiv …, 2024 - arxiv.org

Automating network processes without human intervention is crucial for the complex 6G
environment. This requires zero-touch management and orchestration, the integration of …

To the Max: Reinventing Reward in Reinforcement Learning

G Veviurko, W Böhmer, M de Weerdt - arXiv preprint arXiv:2402.01361, 2024 - arxiv.org

In reinforcement learning (RL), different rewards can define the same optimal policy but
result in drastically different learning performance. For some, the agent gets stuck with a …

Cited by: 1 · All 3 versions

[PDF] tudelft.nl

To the Max

G Veviurko, JW Böhmer, MM de Weerdt - 2024 - repository.tudelft.nl

In reinforcement learning (RL), different reward functions can define the same optimal policy
but result in drastically different learning performance. For some, the agent gets stuck with a …
