Explainable AI over the Internet of Things (IoT): Overview, state-of-the-art and future directions

SK Jagatheesaperumal, QV Pham… - IEEE Open Journal …, 2022 - ieeexplore.ieee.org
Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by
enhancing the trust of end-users in machines. As the number of connected devices keeps on …

A state‐of‐the‐art review of optimal reservoir control for managing conflicting demands in a changing world

M Giuliani, JR Lamontagne, PM Reed… - Water Resources …, 2021 - Wiley Online Library
The state of the art for optimal water reservoir operations is rapidly evolving, driven by
emerging societal challenges. Changing values for balancing environmental resources …

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2024 - proceedings.neurips.cc
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

Multi-objective GFlowNets

M Jain, SC Raparthy… - International …, 2023 - proceedings.mlr.press
We study the problem of generating diverse candidates in the context of Multi-Objective
Optimization. In many applications of machine learning such as drug discovery and material …

Personalized soups: Personalized large language model alignment via post-hoc parameter merging

J Jang, S Kim, BY Lin, Y Wang, J Hessel… - arXiv preprint arXiv …, 2023 - arxiv.org
While Reinforcement Learning from Human Feedback (RLHF) aligns Large Language
Models (LLMs) with general, aggregate human preferences, it is suboptimal for learning …

Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)

P Vamplew, BJ Smith, J Källström, G Ramos… - Autonomous Agents and …, 2022 - Springer
The recent paper “Reward is Enough” by Silver, Singh, Precup and Sutton posits that the
concept of reward maximisation is sufficient to underpin all intelligence, both natural and …

Goals, usefulness and abstraction in value-based choice

B De Martino, A Cortese - Trends in Cognitive Sciences, 2023 - cell.com
Colombian drug lord Pablo Escobar, while on the run, purportedly burned two
million dollars in banknotes to keep his daughter warm. A stark reminder that, in life …

A review of deep reinforcement learning approaches for smart manufacturing in Industry 4.0 and 5.0 framework

A del Real Torres, DS Andreiana, Á Ojeda Roldán… - Applied Sciences, 2022 - mdpi.com
In this review, the industry's current issues regarding intelligent manufacture are presented.
This work presents the status and the potential for the I4.0 and I5.0's revolutionary …

Optimistic linear support and successor features as a basis for optimal policy transfer

LN Alegre, A Bazzan… - … conference on machine …, 2022 - proceedings.mlr.press
In many real-world applications, reinforcement learning (RL) agents might have to solve
multiple tasks, each one typically modeled via a reward function. If reward functions are …

Promptable behaviors: Personalizing multi-objective rewards from human preferences

M Hwang, L Weihs, C Park, K Lee… - Proceedings of the …, 2024 - openaccess.thecvf.com
Customizing robotic behaviors to be aligned with diverse human preferences is an
underexplored challenge in the field of embodied AI. In this paper we present Promptable …