A POMDP formulation of preference elicitation problems

C Gao, W Lei, X He, M de Rijke, TS Chua - AI open, 2021 - Elsevier

Recommender systems exploit interaction history to estimate user preference, having been
heavily used in a wide range of industry applications. However, static recommendation …

Cited by 256 - Related articles - All 8 versions

Autonomous agents modelling other agents: A comprehensive survey and open problems

SV Albrecht, P Stone - Artificial Intelligence, 2018 - Elsevier

Much research in artificial intelligence is concerned with the development of autonomous
agents that can interact effectively with other agents. An important aspect of such agents is …

Cited by 557 - Related articles - All 10 versions

How to design AI for social good: Seven essential factors

L Floridi, J Cowls, TC King, M Taddeo - Ethics, Governance, and Policies …, 2021 - Springer

The idea of Artificial Intelligence for Social Good (henceforth AI4SG) is gaining
traction within information societies in general and the AI community in particular. It has the …

Cited by 329 - Related articles - All 15 versions

A generalized algorithm for multi-objective reinforcement learning and policy adaptation

R Yang, X Sun, K Narasimhan - Advances in neural …, 2019 - proceedings.neurips.cc

We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear
preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is …

Cited by 265 - Related articles - All 11 versions

Multiobjective reinforcement learning: A comprehensive overview

C Liu, X Xu, D Hu - IEEE Transactions on Systems, Man, and …, 2014 - ieeexplore.ieee.org

Reinforcement learning (RL) is a powerful paradigm for sequential decision-making under
uncertainties, and most RL algorithms aim to maximize some numerical value which …

Cited by 412 - Related articles - All 3 versions

[BOOK][B] Probabilistic graphical models: principles and techniques

D Koller, N Friedman - 2009 - books.google.com

A general framework for constructing and using probabilistic models of complex systems that
would enable a computer to use available information for making decisions. Most tasks …

Cited by 11134 - Related articles - All 13 versions

A survey of point-based POMDP solvers

G Shani, J Pineau, R Kaplow - Autonomous Agents and Multi-Agent …, 2013 - Springer

The past decade has seen a significant breakthrough in research on solving partially
observable Markov decision processes (POMDPs). Where past solvers could not scale …

Cited by 753 - Related articles - All 12 versions

[PDF] An MDP-based recommender system.

G Shani, D Heckerman, RI Brafman… - Journal of machine …, 2005 - jmlr.org

Typical recommender systems adopt a static view of the recommendation process and treat
it as a prediction problem. We argue that it is more appropriate to view the problem of …

Cited by 1427 - Related articles - All 25 versions

Potential-based reward shaping for finite horizon online POMDP planning

A Eck, LK Soh, S Devlin, D Kudenko - Autonomous Agents and Multi-Agent …, 2016 - Springer

In this paper, we address the problem of suboptimal behavior during online partially
observable Markov decision process (POMDP) planning caused by time constraints on …

Cited by 180 - Related articles - All 7 versions

Consequences of misaligned AI

S Zhuang, D Hadfield-Menell - Advances in Neural …, 2020 - proceedings.neurips.cc

AI systems often rely on two key components: a specified goal or reward function and an
optimization algorithm to compute the optimal behavior for that goal. This approach is …

Cited by 69 - Related articles - All 5 versions