Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2024 - proceedings.neurips.cc
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

A practical guide to multi-objective reinforcement learning and planning

CF Hayes, R Rădulescu, E Bargiacchi… - Autonomous Agents and …, 2022 - Springer
Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …

Multi-objective multi-agent decision making: a utility-based analysis and survey

R Rădulescu, P Mannion, DM Roijers… - Autonomous Agents and …, 2020 - Springer
The majority of multi-agent system implementations aim to optimise agents' policies with
respect to a single objective, despite the fact that many real-world problem domains are …

Multi-objective deep reinforcement learning

H Mossalam, YM Assael, DM Roijers… - arXiv preprint arXiv …, 2016 - arxiv.org
We propose Deep Optimistic Linear Support Learning (DOL) to solve high-dimensional multi-
objective decision problems where the relative importances of the objectives are not known …

Human-aligned artificial intelligence is a multiobjective problem

P Vamplew, R Dazeley, C Foale, S Firmin… - Ethics and information …, 2018 - Springer
As the capabilities of artificial intelligence (AI) systems improve, it becomes important to
constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of …

A multi-objective deep reinforcement learning framework

TT Nguyen, ND Nguyen, P Vamplew… - … Applications of Artificial …, 2020 - Elsevier
This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL)
framework based on deep Q-networks. We develop a high-performance MODRL framework …

MO-MIX: Multi-objective multi-agent cooperative decision-making with deep reinforcement learning

T Hu, B Luo, C Yang, T Huang - IEEE Transactions on Pattern …, 2023 - ieeexplore.ieee.org
Deep reinforcement learning (RL) has been applied extensively to solve complex decision-
making problems. In many real-world scenarios, tasks often have several conflicting …

Self-improving system integration: Mastering continuous change

K Bellman, J Botev, A Diaconescu, L Esterle… - Future Generation …, 2021 - Elsevier
The research initiative “self-improving system integration” (SISSY) was established with the
goal to master the ever-changing demands of system organisation in the presence of …

[BOOK][B] Self-adaptation for individual self-aware computing systems

M Maggio, T Abdelzaher, L Esterle, H Giese… - 2017 - Springer
This chapter discusses the role of self-awareness for adaptation at the individual level, when
one single entity receives inputs both from itself or some of its components and from the …

Autonomy and intelligence in the computing continuum: Challenges, enablers, and future directions for orchestration

H Kokkonen, L Lovén, NH Motlagh, A Kumar… - arXiv preprint arXiv …, 2022 - arxiv.org
Future AI applications require performance, reliability and privacy that the existing, cloud-
dependent system architectures cannot provide. In this article, we study orchestration in the …