Boyi Liu - Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	306	306
h-index	7	7
i10-index	5	5

Citations per year: 2019: 6; 2020: 38; 2021: 61; 2022: 58; 2023: 78; 2024: 65

Public access

5 articles available
0 articles not available

Based on funding mandates

Co-authors

Zhaoran Wang, Assistant Professor at Northwestern University (verified email at northwestern.edu)
Zhuoran Yang, Yale University (verified email at yale.edu)
Qi Cai, Northwestern University (verified email at u.northwestern.edu)
Jiayang Li, Ph.D. Candidate, Northwestern University (verified email at u.northwestern.edu)

Boyi Liu

Northwestern University

Verified email at u.northwestern.edu

Reinforcement Learning, Machine Learning, Optimization


Title	Authors	Venue	Cited by	Year
Neural trust region/proximal policy optimization attains globally optimal policy	B Liu, Q Cai, Z Yang, Z Wang	Advances in Neural Information Processing Systems 32, 2019	207	2019
Reason for future, act for now: A principled framework for autonomous LLM agents with provable sample efficiency	Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu, Z Wang	arXiv preprint arXiv:2309.17382, 2023	21*	2023
Off-policy evaluation and learning from logged bandit feedback: Error reduction via surrogate policy	Y Xie, B Liu, Q Liu, Z Wang, Y Zhou, J Peng	International Conference on Learning Representations, 2018	20	2018
Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence	B Liu, J Li, Z Yang, HT Wai, M Hong, Y Nie, Z Wang	Advances in Neural Information Processing Systems, 2022	15*	2022
Differentiable bilevel programming for Stackelberg congestion games	J Li, J Yu, Q Wang, B Liu, Z Wang, YM Nie	arXiv preprint arXiv:2209.07618, 2022	13	2022
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL	F Zhang, B Liu, K Wang, VYF Tan, Z Yang, Z Wang	Advances in Neural Information Processing Systems, 2022	8	2022
An analysis of attention via the lens of exchangeability and latent variable models	Y Zhang, B Liu, Q Cai, L Wang, Z Wang	arXiv preprint arXiv:2212.14852, 2022	7	2022
Let models speak ciphers: Multiagent debate through embeddings	C Pham, B Liu, Y Yang, Z Chen, T Liu, J Yuan, BA Plummer, Z Wang, ...	arXiv preprint arXiv:2310.06272, 2023	5	2023
Policy Optimization in Zero-Sum Markov Games: Fictitious Self-Play Provably Attains Nash Equilibria	B Liu, Z Yang, Z Wang		3	2020
Provably mitigating overoptimization in RLHF: Your SFT loss is implicitly an adversarial regularizer	Z Liu, M Lu, S Zhang, B Liu, H Guo, Y Yang, J Blanchet, Z Wang	arXiv preprint arXiv:2405.16436, 2024	2	2024
Model-based reparameterization policy gradient methods: Theory and practical algorithms	S Zhang, B Liu, Z Wang, T Zhao	Advances in Neural Information Processing Systems 36, 2024	2	2024
Achieving hierarchy-free approximation for bilevel programs with equilibrium constraints	J Li, J Yu, B Liu, Y Nie, Z Wang	International Conference on Machine Learning, 20312-20335, 2023	2	2023
Differentiable Arbitrating in Zero-sum Markov Games	J Wang, M Song, F Gao, B Liu, Z Wang, Y Wu	International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023	1	2023
-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model	Y Zhang, L Chen, B Liu, Y Yang, Q Cui, Y Tao, H Yang	arXiv preprint arXiv:2403.07191, 2024		2024
Double duality: variational primal-dual policy optimization for constrained reinforcement learning	Z Li, B Liu, Z Yang, Z Wang, M Wang	Journal of Machine Learning Research 24 (385), 1-43, 2023		2023
BooVI: provably efficient bootstrapped value iteration	B Liu, Q Cai, Z Yang, Z Wang	Advances in Neural Information Processing Systems 34, 7041-7053, 2021		2021

Articles 1–16