Yunhao Tang academic profile - Scholar search

Cited by

	All	Since 2019
Citations	2108	2101
h-index	19	19
i10-index	28	28

Citations per year (bar chart): 2018: 6; 2019: 39; 2020: 127; 2021: 190; 2022: 237; 2023: 359; 2024: 1141

Public access

4 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Rémi Munos	Google DeepMind	Verified email at inria.fr
Michal Valko	Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind	Verified email at meta.com
Krzysztof Choromanski	Google Brain Robotics New York & Columbia University	Verified email at columbia.edu
Mark Rowland	Research Scientist, Google DeepMind	Verified email at google.com
Aldo Pacchiano	Broad Institute of MIT and Harvard	Verified email at broadinstitute.org
Will Dabney	DeepMind	Verified email at google.com
Shipra Agrawal	Columbia University	Verified email at columbia.edu
Tamás Sarlós	Google	Verified email at google.com
Vikas Sindhwani	Google DeepMind Robotics	Verified email at google.com
Tadashi Kozuno	OMRON SINIC X	Verified email at alumni.oist.jp
Wenbo Gao	Columbia University	Verified email at columbia.edu
Florent Altché	Research Engineer, DeepMind	Verified email at google.com
Yuri Faenza	Associate Professor, IEOR, Columbia University	Verified email at columbia.edu
Alp Kucukelbir	Adjunct Professor of Computer Science, Columbia University	Verified email at cs.columbia.edu
Adrian Weller	Director of Research, Machine Learning, University of Cambridge	Verified email at eng.cam.ac.uk
Anna Choromanska	New York University	Verified email at nyu.edu
Michael I. Jordan	Professor of Electrical Engineering and Computer Sciences and Professor of Statistics, UC Berkeley	Verified email at cs.berkeley.edu
Jiri Hron	Research Scientist, Google DeepMind	Verified email at google.com
Steven Kapturowski	DeepMind	Verified email at google.com
David Abel	Research Scientist, DeepMind	Verified email at deepmind.com


Yunhao Tang

Research Scientist, DeepMind

Verified email at columbia.edu - Homepage

Reinforcement Learning


Title	Authors	Venue	Cited by	Year
Gemini: a family of highly capable multimodal models	G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...	arXiv preprint arXiv:2312.11805, 2023	843	2023
Reinforcement learning for integer programming: Learning to cut	Y Tang, S Agrawal, Y Faenza	International Conference on Machine Learning, 9367-9376, 2020	203	2020
ES-MAML: Simple Hessian-free meta learning	X Song, W Gao, Y Yang, K Choromanski, A Pacchiano, Y Tang	arXiv preprint arXiv:1910.01215, 2019	131	2019
Discretizing continuous action space for on-policy optimization	Y Tang, S Agrawal	Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 5981-5988, 2020	119	2020
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...	arXiv preprint arXiv:2403.05530, 2024	81	2024
Monte-Carlo tree search as regularized policy optimization	JB Grill, F Altché, Y Tang, T Hubert, M Valko, I Antonoglou, R Munos	International Conference on Machine Learning, 3769-3778, 2020	70	2020
BYOL-Explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in Neural Information Processing Systems 35, 31855-31870, 2022	57	2022
From complexity to simplicity: Adaptive ES-active subspaces for blackbox optimization	KM Choromanski, A Pacchiano, J Parker-Holder, Y Tang, V Sindhwani	Advances in Neural Information Processing Systems 32, 2019	50	2019
Orthogonal estimation of Wasserstein distances	M Rowland, J Hron, Y Tang, K Choromanski, T Sarlos, A Weller	The 22nd International Conference on Artificial Intelligence and Statistics …, 2019	47	2019
Provably robust blackbox optimization for reinforcement learning	K Choromanski, A Pacchiano, J Parker-Holder, Y Tang, D Jain, Y Yang, ...	CoRR, abs/1903.02993, 2019	42	2019
Learning to Score Behaviors for Guided Policy Optimization	A Pacchiano, J Parker-Holder, Y Tang, A Choromanska, K Choromanski, ...	arXiv preprint arXiv:1906.04349, 2019	40	2019
Exploration by distributional reinforcement learning	Y Tang, S Agrawal	arXiv preprint arXiv:1805.01907, 2018	40	2018
Boosting trust region policy optimization by normalizing flows policy	Y Tang, S Agrawal	arXiv preprint arXiv:1809.10326, 2018	33	2018
Nash learning from human feedback	R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...	arXiv preprint arXiv:2312.00886, 2023	31	2023
Understanding self-predictive learning for reinforcement learning	Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...	International Conference on Machine Learning, 33632-33656, 2023	26	2023
Self-imitation learning via generalized lower bound Q-learning	Y Tang	Advances in Neural Information Processing Systems 33, 13964-13975, 2020	23	2020
Hindsight expectation maximization for goal-conditioned reinforcement learning	Y Tang, A Kucukelbir	International Conference on Artificial Intelligence and Statistics, 2863-2871, 2021	20	2021
Revisiting Peng’s Q(λ) for Modern Reinforcement Learning	T Kozuno, Y Tang, M Rowland, R Munos, S Kapturowski, W Dabney, ...	International Conference on Machine Learning, 5794-5804, 2021	19	2021
Taylor expansion policy optimization	Y Tang, M Valko, R Munos	International Conference on Machine Learning, 9397-9406, 2020	19	2020
Generalized Preference Optimization: A Unified Approach to Offline Alignment	Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ...	arXiv preprint arXiv:2402.05749, 2024	15	2024
