Yi Wan - Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	203	203
h-index	6	6
i10-index	6	6

Citations per year: 2019: 2, 2020: 8, 2021: 26, 2022: 44, 2023: 70, 2024: 52

Open-access publications (based on funders' mandatory open-access policies)

Available: 4 articles
Not available: 0 articles

Yi Wan

Meta

Verified email at meta.com

reinforcement learning


Title	Cited by	Year
Learning and planning in average-reward Markov decision processes. Y Wan, A Naik, RS Sutton. International Conference on Machine Learning, 10653-10662, 2021	70	2021
Average-reward off-policy policy evaluation with function approximation. S Zhang, Y Wan, RS Sutton, S Whiteson. International Conference on Machine Learning, 12578-12588, 2021	37	2021
Planning with expectation models. Y Wan, Z Abbas, A White, M White, RS Sutton. arXiv preprint arXiv:1904.01191, 2019	29	2019
Off-policy maximum entropy reinforcement learning: Soft actor-critic with advantage weighted mixture policy (SAC-AWMP). Z Hou, K Zhang, Y Wan, D Li, C Fu, H Yu. arXiv preprint arXiv:2002.02829, 2020	18	2020
Towards evaluating adaptivity of model-based reinforcement learning methods. Y Wan, A Rahimi-Kalahroudi, J Rajendran, I Momennejad, S Chandar, ... International Conference on Machine Learning, 22536-22561, 2022	14	2022
Average-reward learning and planning with options. Y Wan, A Naik, R Sutton. Advances in Neural Information Processing Systems 34, 22758-22769, 2021	12	2021
Model-based reinforcement learning with non-linear expectation models and stochastic environments. Y Wan, M Zaheer, M White, RS Sutton. FAIM Workshop on Prediction and Generative Modeling in Reinforcement …, 2018	6	2018
Toward discovering options that achieve faster planning. Y Wan, RS Sutton. arXiv preprint arXiv:2205.12515, 2022	4	2022
On convergence of average-reward off-policy control algorithms in weakly communicating MDPs. Y Wan, RS Sutton. arXiv preprint arXiv:2209.15141, 2022	3	2022
Pearl: A Production-ready Reinforcement Learning Agent. Z Zhu, RS Braz, J Bhandari, D Jiang, Y Wan, Y Efroni, L Wang, R Xu, ... arXiv preprint arXiv:2312.03814, 2023	2	2023
Learning and Planning with the Average-Reward Formulation. Y Wan	2	2023
The Emphatic Approach to Average-Reward Policy Evaluation. J He, Y Wan, AR Mahmood. Deep Reinforcement Learning Workshop NeurIPS 2022, 2022	2	2022
On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes. Y Wan, H Yu, RS Sutton. arXiv preprint arXiv:2408.16262, 2024	1	2024
A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays. H Yu, Y Wan, RS Sutton. arXiv preprint arXiv:2312.15091, 2023	1	2023
Loosely consistent emphatic temporal-difference learning. J He, F Che, Y Wan, AR Mahmood. Uncertainty in Artificial Intelligence, 849-859, 2023	1	2023
Planning with expectation models for control. K Kudashkina, Y Wan, A Naik, RS Sutton. arXiv preprint arXiv:2104.08543, 2021	1	2021
Asynchronous Stochastic Approximation and Average-Reward Reinforcement Learning. H Yu, Y Wan, RS Sutton. arXiv preprint arXiv:2409.03915, 2024		2024
Reward Centering. A Naik, Y Wan, M Tomar, RS Sutton. arXiv preprint arXiv:2405.09999, 2024		2024
Discovering Options by Minimizing the Number of Composed Options to Solve Multiple Tasks. Y Wan, RS Sutton
Incremental Policy Gradients for Online Reinforcement Learning Control. K De Asis, A Chan, Y Wan, RS Sutton
