David Lindner
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ...
arXiv preprint arXiv:2307.15217, 2023
Cited by: 236; Year: 2023
Red-Teaming the Stable Diffusion Safety Filter
J Rando, D Paleka, D Lindner, L Heim, F Tramèr
NeurIPS ML Safety Workshop, 2022
Cited by: 85; Year: 2022
Tracr: Compiled Transformers as a Laboratory for Interpretability
D Lindner, J Kramár, M Rahtz, T McGrath, V Mikulik
Conference on Neural Information Processing Systems (NeurIPS), 2023
Cited by: 37; Year: 2023
GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems
B Sukhija, M Turchetta, D Lindner, A Krause, S Trimpe, D Baumann
Artificial Intelligence, 103922, 2023
Cited by: 20; Year: 2023
Sensing Social Media Signals for Cryptocurrency News
J Beck, R Huang, D Lindner, T Guo, C Zhang, D Helbing, ...
Companion Proceedings of The 2019 World Wide Web Conference, 2019
Cited by: 19; Year: 2019
Active exploration for inverse reinforcement learning
D Lindner, A Krause, G Ramponi
Advances in Neural Information Processing Systems 35, 5843-5853, 2022
Cited by: 18; Year: 2022
Information Directed Reward Learning for Reinforcement Learning
D Lindner, M Turchetta, S Tschiatschek, K Ciosek, A Krause
Conference on Neural Information Processing Systems (NeurIPS), 2021
Cited by: 17; Year: 2021
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
J Rocamonde, V Montesinos, E Nava, E Perez, D Lindner
arXiv preprint arXiv:2310.12921, 2023
Cited by: 15; Year: 2023
Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning
D Lindner, M El-Assady
Communication in Human-AI Interaction Workshop (CHAI) at IJCAI-ECAI, 2022
Cited by: 11; Year: 2022
Evaluating Frontier Models for Dangerous Capabilities
M Phuong, M Aitchison, E Catt, S Cogan, A Kaskasoli, V Krakovna, ...
arXiv preprint arXiv:2403.13793, 2024
Cited by: 7; Year: 2024
Addressing the Long-term Impact of ML Decisions via Policy Regret
D Lindner, H Heidari, A Krause
International Joint Conferences on Artificial Intelligence (IJCAI), 2021
Cited by: 7; Year: 2021
Challenges for Using Impact Regularizers to Avoid Negative Side Effects
D Lindner, K Matoba, A Meulemans
SafeAI Workshop at AAAI 2021, 2021
Cited by: 7; Year: 2021
Interactively Learning Preference Constraints in Linear Bandits
D Lindner, S Tschiatschek, K Hofmann, A Krause
International Conference on Machine Learning (ICML), 2022
Cited by: 6; Year: 2022
Learning safety constraints from demonstrations with unknown rewards
D Lindner, X Chen, S Tschiatschek, K Hofmann, A Krause
International Conference on Artificial Intelligence and Statistics, 2386-2394, 2024
Cited by: 5; Year: 2024
Topological semimetals and insulators in three-dimensional honeycomb materials
D Wawrzik, D Lindner, M Hermanns, S Trebst
Physical Review B 98 (11), 115114, 2018
Cited by: 5; Year: 2018
RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
Y Metz, D Lindner, R Baur, D Keim, M El-Assady
Interactive Learning with Implicit Human Feedback Workshop at ICML, 2023
Cited by: 4; Year: 2023
Learning What To Do by Simulating the Past
D Lindner, R Shah, P Abbeel, A Dragan
International Conference on Learning Representations (ICLR), 2021
Cited by: 4; Year: 2021
Detecting Spiky Corruption in Markov Decision Processes
J Mancuso, T Kisielewski, D Lindner, A Singh
Workshop on Artificial Intelligence Safety at IJCAI 2019, 2019
Cited by: 2; Year: 2019
Algorithmic Foundations for Safe and Efficient Reinforcement Learning from Human Feedback
D Lindner
ETH Zurich, 2023
Year: 2023