A Panickssery, SR Bowman, S Feng - arXiv preprint arXiv:2404.13076, 2024 - arxiv.org
Self-evaluation using large language models (LLMs) has proven valuable not only in
benchmarking but also in methods like reward modeling, constitutional AI, and self-refinement …