Title | Authors | Venue | Cited by | Year
Value-decomposition networks for cooperative multi-agent learning | P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ... | arXiv preprint arXiv:1706.05296, 2017 | 1650 | 2017
Deep reinforcement learning in large discrete action spaces | G Dulac-Arnold, R Evans, H van Hasselt, P Sunehag, T Lillicrap, J Hunt, ... | arXiv preprint arXiv:1512.07679, 2015 | 699 | 2015
Scalable evaluation of multi-agent reinforcement learning with Melting Pot | JZ Leibo, EA Dueñez-Guzman, A Vezhnevets, JP Agapiou, P Sunehag, ... | International Conference on Machine Learning, 6187-6199, 2021 | 80 | 2021
The sample-complexity of general reinforcement learning | T Lattimore, M Hutter, P Sunehag | International Conference on Machine Learning, 28-36, 2013 | 70 | 2013
Learning to incentivize other learning agents | J Yang, A Li, M Farajtabar, P Sunehag, E Hughes, H Zha | Advances in Neural Information Processing Systems 33, 15208-15219, 2020 | 61 | 2020
Deep reinforcement learning with attention for slate Markov decision processes with high-dimensional states and actions | P Sunehag, R Evans, G Dulac-Arnold, Y Zwols, D Visentin, B Coppin | arXiv preprint arXiv:1512.01124, 2015 | 54 | 2015
Malthusian reinforcement learning | JZ Leibo, J Perolat, E Hughes, S Wheelwright, AH Marblestone, ... | arXiv preprint arXiv:1812.07019, 2018 | 47 | 2018
Wearable sensor activity analysis using semi-Markov models with a grammar | O Thomas, P Sunehag, G Dror, S Yun, S Kim, M Robards, A Smola, ... | Pervasive and Mobile Computing 6 (3), 342-350, 2010 | 46 | 2010
Variable metric stochastic approximation theory | P Sunehag, J Trumpf, SVN Vishwanathan, N Schraudolph | Artificial Intelligence and Statistics, 560-566, 2009 | 44 | 2009
Reinforcement learning agents acquire flocking and symbiotic behaviour in simulated ecosystems | P Sunehag, G Lever, S Liu, J Merel, N Heess, JZ Leibo, E Hughes, ... | Artificial Life Conference Proceedings, 103-110, 2019 | 31 | 2019
Value-decomposition networks for cooperative multi-agent learning. arXiv 2017 | P Sunehag, G Lever, A Gruslys, WM Czarnecki, V Zambaldi, M Jaderberg, ... | arXiv preprint arXiv:1706.05296, 2017 | 31 | 2017
Q-learning for history-based reinforcement learning | M Daswani, P Sunehag, M Hutter | Asian Conference on Machine Learning, 213-228, 2013 | 23 | 2013
Semi-Markov kmeans clustering and activity recognition from body-worn sensors | MW Robards, P Sunehag | 2009 Ninth IEEE International Conference on Data Mining, 438-446, 2009 | 19 | 2009
Rationality, optimism and guarantees in general reinforcement learning | P Sunehag, M Hutter | The Journal of Machine Learning Research 16 (1), 1345-1390, 2015 | 18 | 2015
Melting Pot 2.0 | JP Agapiou, AS Vezhnevets, EA Duéñez-Guzmán, J Matyas, Y Mao, ... | arXiv preprint arXiv:2211.13746, 2022 | 17 | 2022
Feature reinforcement learning: state of the art | M Daswani, P Sunehag, M Hutter | Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014 | 16 | 2014
Adaptive context tree weighting | A O'Neill, M Hutter, W Shao, P Sunehag | 2012 Data Compression Conference, 317-326, 2012 | 16 | 2012
(Non-)equivalence of universal priors | I Wood, P Sunehag, M Hutter | Algorithmic Probability and Friends. Bayesian Prediction and Artificial …, 2013 | 15 | 2013
Optimistic agents are asymptotically optimal | P Sunehag, M Hutter | AI 2012: Advances in Artificial Intelligence: 25th Australasian Joint …, 2012 | 15 | 2012
Consistency of feature Markov processes | P Sunehag, M Hutter | Algorithmic Learning Theory, 360-374, 2010 | 15 | 2010