- Academic resource search

A practical guide to multi-objective reinforcement learning and planning

CF Hayes, R Rădulescu, E Bargiacchi… - Autonomous Agents and …, 2022 - Springer

Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …

Cited by 265 - Related articles - All 21 versions

[PDF] aaai.org

Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs

T Jin, HL Hsu, W Chang, P Xu - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …

Cited by 3 - Related articles - All 4 versions

[PDF] ieee.org

Context aware control systems: An engineering applications perspective

RAC Diaz, M Ghita, D Copot, IR Birs, C Muresan… - IEEE …, 2020 - ieeexplore.ieee.org

Cyber-physical systems revolve around context awareness, empowering objective-oriented
services, products and operations based on real data. Self-aware and self-control systems …

Cited by 26 - Related articles - All 7 versions

[PDF] neurips.cc

Statistical and computational trade-off in multi-agent multi-armed bandits

F Vannella, A Proutiere, J Jeong - Advances in Neural …, 2024 - proceedings.neurips.cc

We study the problem of regret minimization in Multi-Agent Multi-Armed Bandits (MAMABs)
where the rewards are defined through a factor graph. We derive an instance-specific regret …

Cited by 1 - Related articles - All 6 versions

[PDF] nature.com

Multi-agent Thompson sampling for bandit applications with sparse neighbourhood structures

T Verstraeten, E Bargiacchi, PJK Libin, J Helsen… - Scientific reports, 2020 - nature.com

Multi-agent coordination is prevalent in many real-world applications. However, such
coordination is challenging due to its combinatorial nature. An important observation in this …

Cited by 26 - Related articles - All 9 versions

[PDF] mlr.press

Best arm identification in multi-agent multi-armed bandits

F Vannella, A Proutiere, J Jeong - … Conference on Machine …, 2023 - proceedings.mlr.press

We investigate the problem of best arm identification in Multi-Agent Multi-Armed Bandits
(MAMABs) where the rewards are defined through a factor graph. The objective is to find an …

Cited by 2 - Related articles - All 5 versions

[PDF] ifaamas.org

[PDF] Deep reinforcement learning for active wake control

G Neustroev, SPE Andringa, RA Verzijlbergh… - Proceedings of the 21st …, 2022 - ifaamas.org

Wind farms suffer from so-called wake effects: when turbines are located in the wind
shadows of other turbines, their power output is substantially reduced. These losses can be …

Cited by 9 - Related articles - All 8 versions

[PDF] acm.org

Budget allocation as a multi-agent system of contextual & continuous bandits

B Han, C Arndt - Proceedings of the 27th ACM SIGKDD Conference on …, 2021 - dl.acm.org

Budget allocation for online advertising suffers from multiple complications, including
significant delay between the initial ad impression to the call to action as well as cold-start …

Cited by 11 - Related articles

[PDF] jmlr.org

AI-Toolbox: A C++ library for reinforcement learning and planning (with Python bindings)

E Bargiacchi, DM Roijers, A Nowé - Journal of Machine Learning Research, 2020 - jmlr.org

This paper describes AI-Toolbox, a C++ software library that contains reinforcement learning
and planning algorithms, and supports both single and multi agent problems, as well as …

Cited by 19 - Related articles - All 10 versions

[PDF] vub.be

[PDF] Cooperative Prioritized Sweeping

E Bargiacchi, T Verstraeten, DM Roijers - AAMAS, 2021 - cris.vub.be

We present a novel model-based algorithm, Cooperative Prioritized Sweeping, for sample-
efficient learning in large multi-agent Markov decision processes. Our approach leverages …

Cited by 14 - Related articles - All 8 versions
