Reinforcement learning with multiple experts: A Bayesian model combination approach M Gimelfarb, S Sanner, CG Lee Advances in Neural Information Processing Systems (NeurIPS) 31, 9528-9538, 2018 | 29 | 2018 |
ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning M Gimelfarb, S Sanner, CG Lee Uncertainty in Artificial Intelligence (UAI-19) 35, 476-485, 2019 | 26 | 2019 |
Risk-Aware Transfer in Reinforcement Learning using Successor Features M Gimelfarb, A Barreto, S Sanner, CG Lee Advances in Neural Information Processing Systems (NeurIPS) 34, 2021 | 17 | 2021 |
pyRDDLGym: From RDDL to Gym Environments A Taitler, M Gimelfarb, J Jeong, S Gopalakrishnan, M Mladenov, X Liu, ... PRL Workshop – Bridging the Gap Between AI Planning and Reinforcement Learning, 2023 | 9 | 2023 |
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization J Jeong, X Wang, M Gimelfarb, H Kim, B Abdulhai, S Sanner International Conference on Learning Representations (ICLR), 2023 | 8 | 2023 |
Contextual policy transfer in reinforcement learning domains via deep mixtures-of-experts M Gimelfarb, S Sanner, CG Lee Uncertainty in Artificial Intelligence (UAI-21) 37, 1787-1797, 2021 | 6* | 2021 |
A Distributional Framework for Risk-Sensitive End-to-End Planning in Continuous MDPs N Patton, J Jeong, M Gimelfarb, S Sanner AAAI Conference on Artificial Intelligence (AAAI) 36 (9), 9894-9901, 2022 | 5* | 2022 |
The 2023 International Planning Competition A Taitler, R Alford, J Espasa, G Behnke, D Fišer, M Gimelfarb, ... AI Magazine, 2024 | 2 | 2024 |
Bayesian Experience Reuse for Learning from Multiple Demonstrators M Gimelfarb, S Sanner, CG Lee International Joint Conference on Artificial Intelligence (IJCAI) 30, 2021 | 2 | 2021 |
JaxPlan and GurobiPlan: Optimization Baselines for Replanning in Discrete and Mixed Discrete-Continuous Probabilistic Domains M Gimelfarb, A Taitler, S Sanner Proceedings of the International Conference on Automated Planning and …, 2024 | | 2024 |
Constraint-Generation Policy Optimization (CGPO): Nonlinear Programming for Policy Optimization in Mixed Discrete-Continuous MDPs M Gimelfarb, A Taitler, S Sanner arXiv preprint arXiv:2401.12243, 2024 | | 2024 |
Thompson Sampling for Parameterized Markov Decision Processes with Uninformative Actions M Gimelfarb, MJ Kim arXiv preprint arXiv:2305.07844, 2023 | | 2023 |
Who Should I Trust?: Uncertainty and Risk for Knowledge Transfer from Multiple Sources in Reinforcement Learning Domains M Gimelfarb University of Toronto (Canada), 2023 | | 2023 |
Distributional Reward Shaping: Point Estimates Are All You Need M Gimelfarb, S Sanner, CG Lee The Multi-disciplinary Conference on Reinforcement Learning and Decision …, 2022 | | 2022 |
End-to-End Risk-Aware Planning by Gradient Descent N Patton, J Jeong, M Gimelfarb, S Sanner PRL Workshop – Bridging the Gap Between AI Planning and Reinforcement Learning, 2021 | | 2021 |
Thompson Sampling for the Control of a Queue with Demand Uncertainty M Gimelfarb University of Toronto (Canada), 2017 | | 2017 |