H Hu, D Sadigh - International Conference on Machine …, 2023 - proceedings.mlr.press
One of the fundamental quests of AI is to produce agents that coordinate well with humans. This problem is challenging, especially in domains that lack high-quality human behavioral …
When interacting with people, AI agents do not just influence the state of the world--they also influence the actions people take in response to the agent, and even their underlying …
Learning policies via preference-based reward learning is an increasingly popular method for customizing agent behavior, but has been shown anecdotally to be prone to spurious …
The concept of rationality is central to the field of artificial intelligence. Whether we are seeking to simulate human reasoning, or the goal is to achieve bounded optimality, we …
There is a recent trend of applying multi-agent reinforcement learning (MARL) to train an agent that can cooperate with humans in a zero-shot fashion without using any human data …
Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences. However, current RLHF techniques …
The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of …
Assistive agents should make humans' lives easier. Classically, such assistance is studied through the lens of inverse reinforcement learning, where an assistive agent (e.g., a chatbot …
G Chen, X Li, C Sun, H Wang - arXiv preprint arXiv:2310.00817, 2023 - arxiv.org
As artificial intelligence (AI) systems play an increasingly prominent role in human decision-making, challenges surface in the realm of human-AI interactions. One challenge arises from …