Sungsu Lim
Verified email at ualberta.ca
Title / Cited by / Year
Greedification operators for policy optimization: Investigating forward and reverse KL divergences
A Chan, H Silva, S Lim, T Kozuno, AR Mahmood, M White
Journal of Machine Learning Research 23 (253), 1-79, 2022
Cited by: 23. Year: 2022
Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces
S Lim, A Joseph, L Le, Y Pan, M White
NeurIPS 2018, Deep Reinforcement Learning Workshop, https://arxiv.org/abs …, 2018
Cited by: 21*. Year: 2018
Maximizing Information Gain in Partially Observable Environments via Prediction Rewards
Y Satsangi, S Lim, S Whiteson, F Oliehoek, M White
AAMAS 2020, 2020
Cited by: 16. Year: 2020
Actor-Expert: A Framework for using Q-learning in Continuous Action Spaces
S Lim
University of Alberta, 2019
Cited by: 14. Year: 2019
Greedy actor-critic: A new conditional cross-entropy method for policy improvement
S Neumann, S Lim, A Joseph, Y Pan, A White, M White
arXiv preprint arXiv:1810.09103, 2018
Cited by: 4. Year: 2018
An Empirical and Conceptual Categorization of Value-based Exploration Methods
N Yasui, S Lim, C Linke, A White, M White
Cited by: 1. Year: 2019
Maximizing Information Gain in Partially Observable Environments via Prediction Rewards
S Lim, Y Satsangi, S Whiteson, FA Oliehoek, M White
Year: 2020