Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation J He, H Zhong, Z Yang arXiv preprint arXiv:2404.12648, 2024 | 2 | 2024 |
A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations Y Zhu, Y Zhang, Z Wang, Z Yang, X Chen arXiv preprint arXiv:2404.12312, 2024 | | 2024 |
Unveil conditional diffusion models with classifier-free guidance: A sharp statistical theory H Fu, Z Yang, M Wang, M Chen arXiv preprint arXiv:2403.11968, 2024 | 4 | 2024 |
On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games A Altabaa, Z Yang arXiv preprint arXiv:2403.00993, 2024 | | 2024 |
Online Performative Gradient Descent for Learning Nash Equilibria in Decision-Dependent Games Z Zhu, E Fang, Z Yang Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |
Diffusion model is an effective planner and data synthesizer for multi-task reinforcement learning H He, C Bai, K Xu, Z Yang, W Zhang, D Wang, B Zhao, X Li Advances in Neural Information Processing Systems 36, 2024 | 28 | 2024 |
Maximize to explore: One objective function fusing estimation, planning, and exploration Z Liu, M Lu, W Xiong, H Zhong, H Hu, S Zhang, S Zheng, Z Yang, Z Wang Advances in Neural Information Processing Systems 36, 2024 | 28* | 2024 |
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF H Shen, Z Yang, T Chen arXiv preprint arXiv:2402.06886, 2024 | 2 | 2024 |
Neural temporal difference and Q-learning provably converge to global optima Q Cai, Z Yang, JD Lee, Z Wang Mathematics of Operations Research 49 (1), 619-651, 2024 | 147* | 2024 |
Sample-Efficient Multi-Agent RL: An Optimization Perspective N Xiong, Z Liu, Z Wang, Z Yang arXiv preprint arXiv:2310.06243, 2023 | 2 | 2023 |
Online bootstrap inference for policy evaluation in reinforcement learning P Ramprasad, Y Li, Z Yang, Z Wang, WW Sun, G Cheng Journal of the American Statistical Association 118 (544), 2901-2914, 2023 | 31 | 2023 |
Understanding implicit regularization in over-parameterized single index model J Fan, Z Yang, M Yu Journal of the American Statistical Association 118 (544), 2315-2328, 2023 | 14 | 2023 |
Provably Efficient Reinforcement Learning with Linear Function Approximation C Jin, Z Yang, Z Wang, MI Jordan Mathematics of Operations Research 48 (3), 1496-1521, 2023 | 767 | 2023 |
Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks S Chen, M Wang, Z Yang arXiv preprint arXiv:2307.14085, 2023 | 2 | 2023 |
Enforcing hard constraints with soft barriers: Safe reinforcement learning in unknown stochastic environments Y Wang, SS Zhan, R Jiao, Z Wang, W Jin, Z Yang, Z Wang, C Huang, ... International Conference on Machine Learning, 36593-36604, 2023 | 28 | 2023 |
Provably efficient representation learning with tractable planning in low-rank POMDP J Guo, Z Li, H Wang, M Wang, Z Yang, X Zhang International Conference on Machine Learning, 11967-11997, 2023 | 2 | 2023 |
Learning to incentivize information acquisition: Proper scoring rules meet principal-agent model S Chen, J Wu, Y Wu, Z Yang International Conference on Machine Learning, 5194-5218, 2023 | 5 | 2023 |
A general framework for sequential decision-making under adaptivity constraints N Xiong, Z Wang, Z Yang Forty-first International Conference on Machine Learning, 2023 | 3 | 2023 |
Provably efficient generalized lagrangian policy optimization for safe multi-agent reinforcement learning D Ding, X Wei, Z Yang, Z Wang, M Jovanovic Learning for Dynamics and Control Conference, 315-332, 2023 | 6 | 2023 |
What and how does in-context learning learn? Bayesian model averaging, parameterization, and generalization Y Zhang, F Zhang, Z Yang, Z Wang arXiv preprint arXiv:2305.19420, 2023 | 35 | 2023 |