Borda regret minimization for generalized linear dueling bandits | Y Wu, T Jin, H Lou, F Farnoud, Q Gu | ICML 2024, 2023 | 6 | 2023 |
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits | Q Di, T Jin, Y Wu, H Zhao, F Farnoud, Q Gu | International Conference on Learning Representations 2024, 2023 | 5 | 2023 |
Pessimistic nonlinear least-squares value iteration for offline reinforcement learning | Q Di, H Zhao, J He, Q Gu | International Conference on Learning Representations 2024, 2023 | 4 | 2023 |
Nearly optimal algorithms for contextual dueling bandits from adversarial feedback | Q Di, J He, Q Gu | arXiv preprint arXiv:2404.10776, 2024 | 1 | 2024 |
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path | Q Di, J He, D Zhou, Q Gu | International Conference on Machine Learning, 2023 | | 2023 |