Behavior alignment via reward function optimization

D Gupta, Y Chandak, S Jordan… - Advances in …, 2024 - proceedings.neurips.cc
Designing reward functions for efficiently guiding reinforcement learning (RL) agents toward
specific behaviors is a complex task. This is challenging since it requires the identification of …

Comparing multi-armed bandit algorithms and Q-learning for multiagent action selection: a case study in route choice

TBF de Oliveira, ALC Bazzan… - … Joint Conference on …, 2018 - ieeexplore.ieee.org
The multi-armed bandit (MAB) problem is concerned with an agent choosing which arm of a
slot machine to play in order to optimize its reward. A family of reinforcement learning …

Reinforcement learning vs. rule-based adaptive traffic signal control: A Fourier basis linear function approximation for traffic signal control

T Ziemke, LN Alegre, ALC Bazzan - AI Communications, 2021 - content.iospress.com
Reinforcement learning is an efficient, widely used machine learning technique that
performs well when the state and action spaces have a reasonable size. This is rarely the …

A Reinforcement Learning Approach with Fourier Basis Linear Function Approximation for Traffic Signal Control

T Ziemke, LN Alegre, ALC Bazzan - ATT@ECAI, 2020 - ceur-ws.org
Reinforcement learning is an efficient, widely used machine learning technique that
performs well when the state and action spaces are reasonable. This is rarely the case …

Competitive Evolution Multi-Agent Deep Reinforcement Learning

W Zhou, Y Chen, J Li - Proceedings of the 3rd International Conference …, 2019 - dl.acm.org
As an effective method to solve the optimal policy in multi-agent systems, multi-agent deep
reinforcement learning (MADRL) has achieved impressive results in many applications …

On the Role of Reward Functions for Reinforcement Learning in the Traffic Assignment Problem

R Grunitzki, G de Oliveira Ramos - 2020 International Joint …, 2020 - ieeexplore.ieee.org
The traffic assignment problem (TAP) consists of assigning routes to road users in order to
minimize traffic congestion. Traditional methods for solving the TAP assume the existence of …

Stand by me: Learning to keep cohesion in the navigation of heterogeneous swarms

TM Grabe, FR Inácio, LS Marcolino, DG Macharet… - lancaster.ac.uk
A robotic swarm is a particular type of Multiagent System that employs a large number of
simpler agents in order to cooperatively perform different tasks. Oftentimes, the …