Cassidy Laidlaw
Verified email at berkeley.edu · Homepage
Title · Cited by · Year
Capture, learning, and synthesis of 3D speaking styles
D Cudeiro, T Bolkart, C Laidlaw, A Ranjan, MJ Black
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019
363 · 2019
Functional Adversarial Attacks
C Laidlaw, S Feizi
Advances in Neural Information Processing Systems 32, 2019
283* · 2019
Perceptual adversarial robustness: Defense against unseen threat models
C Laidlaw, S Singla, S Feizi
arXiv preprint arXiv:2006.12655, 2020
212 · 2020
The Boltzmann policy distribution: Accounting for systematic suboptimality in human models
C Laidlaw, A Dragan
arXiv preprint arXiv:2204.10759, 2022
29 · 2022
Distributional preference learning: Understanding and accounting for hidden context in RLHF
A Siththaranjan, C Laidlaw, D Hadfield-Menell
arXiv preprint arXiv:2312.08358, 2023
24* · 2023
Playing it safe: Adversarial robustness with an abstain option
C Laidlaw, S Feizi
arXiv preprint arXiv:1911.11253, 2019
22 · 2019
Bridging RL theory and practice with the effective horizon
C Laidlaw, SJ Russell, A Dragan
Advances in Neural Information Processing Systems 36, 58953-59007, 2023
20 · 2023
Uncertain decisions facilitate better preference learning
C Laidlaw, S Russell
Advances in Neural Information Processing Systems 34, 15070-15083, 2021
14* · 2021
Preventing reward hacking with occupancy measure regularization
C Laidlaw, S Singhal, A Dragan
arXiv preprint arXiv:2403.03185, 2024
11 · 2024
Toward computationally efficient inverse reinforcement learning via reward shaping
LH Cooke, H Klyne, E Zhang, C Laidlaw, M Tambe, F Doshi-Velez
arXiv preprint arXiv:2312.09983, 2023
1 · 2023
The Effective Horizon Explains Deep RL Performance in Stochastic Environments
C Laidlaw, B Zhu, S Russell, A Dragan
arXiv preprint arXiv:2312.08369, 2023
1 · 2023
Scalably Solving Assistance Games
C Laidlaw, E Bronstein, T Guo, D Feng, L Berglund, J Svegliato, S Russell, ...
ICML 2024 Workshop on Models of Human Feedback for AI Alignment, 2024
2024
Scalable Oversight by Accounting for Unreliable Feedback
S Singhal, C Laidlaw, A Dragan
ICML 2024 Workshop on Models of Human Feedback for AI Alignment, 2024
Articles 1–13