R Jiang, K Chen, X Bai, Z He, J Li, M Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent surge of versatile large language models (LLMs) largely depends on aligning
increasingly capable foundation models with human intentions by preference learning …