Z Sun, Y Shen, Q Zhou, H Zhang, Z Chen… - Proceedings of the 37th …, 2023 - dl.acm.org
Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …