Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework

Y Metz, D Lindner, R Baur, M El-Assady - arXiv preprint arXiv:2411.11761, 2024 - arxiv.org
Reinforcement Learning from Human Feedback (RLHF) has become a powerful tool to fine-tune
or train agentic machine learning models. Similar to how humans interact in social …