Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2024 - proceedings.neurips.cc
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

A practical guide to multi-objective reinforcement learning and planning

CF Hayes, R Rădulescu, E Bargiacchi… - Autonomous Agents and …, 2022 - Springer
Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …

Monte Carlo tree search algorithms for risk-aware and multi-objective reinforcement learning

CF Hayes, M Reymond, DM Roijers, E Howley… - Autonomous Agents and …, 2023 - Springer
In many risk-aware and multi-objective reinforcement learning settings, the utility of the user
is derived from a single execution of a policy. In these settings, making decisions based on …

Exploring the landscape of trustworthy artificial intelligence: Status and challenges

G Mentzas, M Fikardos, K Lepenioti… - Intelligent Decision …, 2024 - content.iospress.com
Artificial Intelligence (AI) has pervaded everyday life, reshaping the landscape of business,
economy, and society through the alteration of interactions and connections among …

Multi-objective coordination graphs for the expected scalarised returns with generative flow models

CF Hayes, T Verstraeten, DM Roijers, E Howley… - arXiv preprint arXiv …, 2022 - arxiv.org
Many real-world problems contain multiple objectives and agents, where a trade-off exists
between objectives. Key to solving such problems is to exploit sparse dependency …

Enhancing Robotic Navigation: An Evaluation of Single and Multi-Objective Reinforcement Learning Strategies

V Young, J Hossain, N Roy - arXiv preprint arXiv:2312.07953, 2023 - arxiv.org
This study presents a comparative analysis between single-objective and multi-objective
reinforcement learning methods for training a robot to navigate effectively to an end goal …