Thomas Coste
Noah's Ark Lab & University of Cambridge
Verified email at cam.ac.uk
Title
Cited by
Year
Reward Model Ensembles Help Mitigate Overoptimization
T Coste, U Anwar, R Kirk, D Krueger
Twelfth International Conference on Learning Representations, 2023
Cited by: 43; Year: 2023
Pangu-agent: A fine-tunable generalist agent with structured reasoning
F Christianos, G Papoudakis, M Zimmer, T Coste, Z Wu, J Chen, ...
arXiv preprint arXiv:2312.14878, 2023
Cited by: 8; Year: 2023
Bayesian Reward Models for LLM Alignment
AX Yang, M Robeyns, T Coste, J Wang, H Bou-Ammar, L Aitchison
ICLR 2024 Workshop on Secure and Trustworthy Large Language Models, 2024
Cited by: 7; Year: 2024
Articles 1–3