Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP | Z Zhang, J Yang, X Ji, SS Du | Advances in Neural Information Processing Systems 34, 4342-4355, 2021 | 62* | 2021 |
Impact of Representation Learning in Linear Bandits | J Yang, W Hu, JD Lee, SS Du | International Conference on Learning Representations, 2021 | 60* | 2021 |
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design | Y Ruan, J Yang, Y Zhou | Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing …, 2021 | 57 | 2021 |
Provable Model-Based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature | K Dong, J Yang, T Ma | Advances in Neural Information Processing Systems 34, 2021 | 40 | 2021 |
Revisiting some common practices in cooperative multi-agent reinforcement learning | W Fu, C Yu, Z Xu, J Yang, Y Wu | arXiv preprint arXiv:2206.07505, 2022 | 36 | 2022 |
Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased | C Yu, J Gao, W Liu, B Xu, H Tang, J Yang, Y Wang, Y Wu | arXiv preprint arXiv:2302.01605, 2023 | 23 | 2023 |
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning | Y Li, T Gao, J Yang, H Xu, Y Wu | International Conference on Machine Learning, 12765-12781, 2022 | 17 | 2022 |
Optimal Gradient-based Algorithms for Non-concave Bandit Optimization | B Huang, K Huang, S Kakade, JD Lee, Q Lei, R Wang, J Yang | Advances in Neural Information Processing Systems 34, 2021 | 16 | 2021 |
Nearly Minimax Algorithms for Linear Bandits with Shared Representation | J Yang, Q Lei, JD Lee, SS Du | arXiv preprint arXiv:2203.15664, 2022 | 14 | 2022 |
Going Beyond Linear RL: Sample Efficient Neural Function Approximation | B Huang, K Huang, S Kakade, JD Lee, Q Lei, R Wang, J Yang | Advances in Neural Information Processing Systems 34, 8968-8983, 2021 | 9 | 2021 |
Fully Gap-Dependent Bounds for Multinomial Logit Bandit | J Yang | International Conference on Artificial Intelligence and Statistics, 199-207, 2021 | 3 | 2021 |