Logan Graham
Anthropic
Verified email at anthropic.com
Title
Cited by
Year
Toward trustworthy AI development: mechanisms for supporting verifiable claims
M Brundage, S Avin, J Wang, H Belfield, G Krueger, G Hadfield, H Khlaaf, ...
arXiv preprint arXiv:2004.07213, 2020
Cited by: 359; Year: 2020
Sleeper agents: Training deceptive LLMs that persist through safety training
E Hubinger, C Denison, J Mu, M Lambert, M Tong, M MacDiarmid, ...
arXiv preprint arXiv:2401.05566, 2024
Cited by: 31; Year: 2024
Multiverse: causal reasoning using importance sampling in probabilistic programming
Y Perov, L Graham, K Gourgoulias, J Richens, C Lee, A Baker, S Johri
Symposium on Advances in Approximate Bayesian Inference, 1-36, 2020
Cited by: 25; Year: 2020
Inferring work task Automatability from AI expert evidence
P Duckworth, L Graham, M Osborne
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 485-491, 2019
Cited by: 17; Year: 2019
Copy, paste, infer: a robust analysis of twin networks for counterfactual inference
L Graham, CM Lee, Y Perov
NeurIPS19 CausalML workshop, 2019
Cited by: 4; Year: 2019
Causal Reasoning and Counterfactual Probabilistic Programming Framework Using Approximate Inference
I Perov, LCS Graham, K Gourgoulias, JG Richens, CM Lee, AP Baker, ...
US Patent App. 16/944,512, 2021
Cited by: 1; Year: 2021
Interpretable causal systems: interpretability and causality in machine learning for human and nonhuman decision-making
L Graham
University of Oxford, 2020
Year: 2020