Action Robust Reinforcement Learning and Applications in Continuous Control C Tessler, Y Efroni, S Mannor arXiv preprint arXiv:1901.09184, 2019 | 231 | 2019 |
Adaptive trust region policy optimization: Global convergence and faster rates for regularized MDPs L Shani, Y Efroni, S Mannor Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 5668-5675, 2020 | 187 | 2020 |
Exploration-exploitation in constrained MDPs Y Efroni, S Mannor, M Pirotta arXiv preprint arXiv:2003.02189, 2020 | 164 | 2020 |
Optimistic policy optimization with bandit feedback Y Efroni, L Shani, A Rosenberg, S Mannor arXiv preprint arXiv:2002.08243, 2020 | 97* | 2020 |
Tight regret bounds for model-based reinforcement learning with greedy policies Y Efroni, N Merlis, M Ghavamzadeh, S Mannor Advances in Neural Information Processing Systems, 12203-12213, 2019 | 78 | 2019 |
RL for latent MDPs: Regret guarantees and a lower bound J Kwon, Y Efroni, C Caramanis, S Mannor Advances in Neural Information Processing Systems 34, 24523-24534, 2021 | 74 | 2021 |
Mirror descent policy optimization M Tomar, L Shani, Y Efroni, M Ghavamzadeh arXiv preprint arXiv:2005.09814, 2020 | 66 | 2020 |
Universality of local weak interactions and its application for interferometric alignment J Dziewior, L Knips, D Farfurnik, K Senkalla, N Benshalom, J Efroni, ... Proceedings of the National Academy of Sciences 116 (8), 2881-2890, 2019 | 55 | 2019 |
Provably filtering exogenous distractors using multistep inverse dynamics Y Efroni, D Misra, A Krishnamurthy, A Agarwal, J Langford International Conference on Learning Representations, 2021 | 49* | 2021 |
Beyond the one-step greedy approach in reinforcement learning Y Efroni, G Dalal, B Scherrer, S Mannor International Conference on Machine Learning, 1387-1396, 2018 | 45 | 2018 |
Reinforcement learning with trajectory feedback Y Efroni, N Merlis, S Mannor Proceedings of the AAAI Conference on Artificial Intelligence 35 (8), 7288-7295, 2021 | 41 | 2021 |
Multiple-step greedy policies in approximate and online reinforcement learning Y Efroni, G Dalal, B Scherrer, S Mannor Advances in Neural Information Processing Systems 31, 2018 | 40 | 2018 |
Provable reinforcement learning with a short-term memory Y Efroni, C Jin, A Krishnamurthy, S Miryoosefi International Conference on Machine Learning, 5832-5850, 2022 | 39 | 2022 |
How to combine tree-search methods in reinforcement learning Y Efroni, G Dalal, B Scherrer, S Mannor Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 3494-3501, 2019 | 35 | 2019 |
Guaranteed discovery of control-endogenous latent states with multi-step inverse models A Lamb, R Islam, Y Efroni, A Didolkar, D Misra, D Foster, L Molu, R Chari, ... arXiv preprint arXiv:2207.08229, 2022 | 33* | 2022 |
Minimax regret for stochastic shortest path A Cohen, Y Efroni, Y Mansour, A Rosenberg Advances in Neural Information Processing Systems 34, 28350-28361, 2021 | 28 | 2021 |
Bandits with partially observable offline data G Tennenholtz, U Shalit, S Mannor, Y Efroni arXiv preprint arXiv:2006.06731, 2020 | 28* | 2020 |
Sample-efficient reinforcement learning in the presence of exogenous information Y Efroni, DJ Foster, D Misra, A Krishnamurthy, J Langford Conference on Learning Theory, 5062-5127, 2022 | 23 | 2022 |
Online planning with lookahead policies Y Efroni, M Ghavamzadeh, S Mannor Advances in Neural Information Processing Systems 33, 14024-14033, 2020 | 21* | 2020 |
Exploration conscious reinforcement learning revisited L Shani, Y Efroni, S Mannor International Conference on Machine Learning, 5680-5689, 2019 | 21 | 2019 |