James Chua
Truthful AI
Verified email at u.nus.edu
Title
Cited by
Year
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
J Chua*, E Rees, H Batra, SR Bowman, J Michael, E Perez, M Turpin
arXiv preprint arXiv:2403.05518, 2024
Cited by: 6, Year: 2024
When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?
R Schaeffer, D Valentine, L Bailey, J Chua, C Eyzaguirre, Z Durante, ...
arXiv preprint arXiv:2407.15211, 2024
Cited by: 4, Year: 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
FJ Binder, J Chua*, T Korbak, H Sleight, J Hughes, R Long, E Perez, ...
arXiv preprint arXiv:2410.13787, 2024
Year: 2024
Language Models Can Articulate Their Implicit Goals
J Betley, X Bao, M Soto, A Sztyber-Betley, J Chua, O Evans
NeurIPS Safe Generative AI Workshop 2024, 0