Advances and challenges in conversational recommender systems: A survey

C Gao, W Lei, X He, M de Rijke, TS Chua - AI open, 2021 - Elsevier
Recommender systems exploit interaction history to estimate user preference, having been
heavily used in a wide range of industry applications. However, static recommendation …

Autonomous agents modelling other agents: A comprehensive survey and open problems

SV Albrecht, P Stone - Artificial Intelligence, 2018 - Elsevier
Much research in artificial intelligence is concerned with the development of autonomous
agents that can interact effectively with other agents. An important aspect of such agents is …

How to design AI for social good: Seven essential factors

L Floridi, J Cowls, TC King, M Taddeo - Ethics, Governance, and Policies …, 2021 - Springer
The idea of Artificial Intelligence for Social Good (henceforth AI4SG) is gaining
traction within information societies in general and the AI community in particular. It has the …

A generalized algorithm for multi-objective reinforcement learning and policy adaptation

R Yang, X Sun, K Narasimhan - Advances in neural …, 2019 - proceedings.neurips.cc
We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear
preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is …

Multiobjective reinforcement learning: A comprehensive overview

C Liu, X Xu, D Hu - IEEE Transactions on Systems, Man, and …, 2014 - ieeexplore.ieee.org
Reinforcement learning (RL) is a powerful paradigm for sequential decision-making under
uncertainties, and most RL algorithms aim to maximize some numerical value which …

[BOOK] Probabilistic graphical models: principles and techniques

D Koller, N Friedman - 2009 - books.google.com
A general framework for constructing and using probabilistic models of complex systems that
would enable a computer to use available information for making decisions. Most tasks …

A survey of point-based POMDP solvers

G Shani, J Pineau, R Kaplow - Autonomous Agents and Multi-Agent …, 2013 - Springer
The past decade has seen a significant breakthrough in research on solving partially
observable Markov decision processes (POMDPs). Where past solvers could not scale …

An MDP-based recommender system

G Shani, D Heckerman, RI Brafman… - Journal of machine …, 2005 - jmlr.org
Typical recommender systems adopt a static view of the recommendation process and treat
it as a prediction problem. We argue that it is more appropriate to view the problem of …

Potential-based reward shaping for finite horizon online POMDP planning

A Eck, LK Soh, S Devlin, D Kudenko - Autonomous Agents and Multi-Agent …, 2016 - Springer
In this paper, we address the problem of suboptimal behavior during online partially
observable Markov decision process (POMDP) planning caused by time constraints on …

Consequences of misaligned AI

S Zhuang, D Hadfield-Menell - Advances in Neural …, 2020 - proceedings.neurips.cc
AI systems often rely on two key components: a specified goal or reward function and an
optimization algorithm to compute the optimal behavior for that goal. This approach is …