On First-Order Meta-Learning Algorithms A Nichol, J Achiam, J Schulman arXiv preprint arXiv:1803.02999, 2018 | 2327 | 2018 |
Constrained Policy Optimization J Achiam, D Held, A Tamar, P Abbeel arXiv preprint arXiv:1705.10528, 2017 | 1435 | 2017 |
Benchmarking Safe Exploration in Deep Reinforcement Learning A Ray, J Achiam, D Amodei https://cdn.openai.com/safexp-short.pdf, 2019 | 387 | 2019 |
Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning J Achiam, S Sastry arXiv preprint arXiv:1703.01732, 2017 | 270 | 2017 |
Spinning Up in Deep Reinforcement Learning J Achiam https://spinningup.openai.com | 256 | |
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods A Stooke, J Achiam, P Abbeel International Conference on Machine Learning, 9133-9143, 2020 | 245 | 2020 |
Variational Option Discovery Algorithms J Achiam, H Edwards, D Amodei, P Abbeel arXiv preprint arXiv:1807.10299, 2018 | 189 | 2018 |
Towards Characterizing Divergence in Deep Q-Learning J Achiam, E Knight, P Abbeel arXiv preprint arXiv:1903.08894, 2019 | 109 | 2019 |
A Hazard Analysis Framework for Code Synthesis Large Language Models H Khlaaf, P Mishkin, J Achiam, G Krueger, M Brundage arXiv preprint arXiv:2207.14157, 2022 | 18 | 2022 |
Advanced Policy Gradient Methods J Achiam Lecture [online] http://rail.eecs.berkeley.edu/deeprlcourse-fa17/f17docs …, 2017 | 5 | 2017 |
Exploration and Safety in Deep Reinforcement Learning JS Achiam University of California, Berkeley, 2021 | 3 | 2021 |
Variational Option Discovery Algorithms J Achiam, D Amodei, H Edwards, P Abbeel | | |
Training Dynamics Models for Accurate Long-Horizon Prediction E Knight, J Achiam, UC OpenAI | | |