Learning-augmented mechanism design: Leveraging predictions for facility location P Agrawal, E Balkanski, V Gkatzelis, T Ou, X Tan Proceedings of the 23rd ACM Conference on Economics and Computation, 497-528, 2022 | 30 | 2022 |
Improved worst-case regret bounds for randomized least-squares value iteration P Agrawal, J Chen, N Jiang Proceedings of the AAAI Conference on Artificial Intelligence 35 (8), 6566-6573, 2021 | 22 | 2021 |
A tractable online learning algorithm for the multinomial logit contextual bandit P Agrawal, T Tulabandhula, V Avadhanula European Journal of Operational Research 310 (2), 737-750, 2023 | 14 | 2023 |
Incentivising exploration and recommendations for contextual bandits with payments P Agrawal, T Tulabandhula Multi-Agent Systems and Agreement Technologies: 17th European Conference …, 2020 | 4 | 2020 |
Learning by repetition: Stochastic multi-armed bandits under priming effect P Agrawal, T Tulabandhula Conference on Uncertainty in Artificial Intelligence, 470-479, 2020 | 3 | 2020 |
Optimistic Q-learning for average reward and episodic reinforcement learning P Agrawal, S Agrawal arXiv preprint arXiv:2407.13743, 2024 | | 2024 |
Bandits with Temporal Stochastic Constraints P Agrawal, T Tulabandhula arXiv preprint arXiv:1811.09026, 2018 | | 2018 |
Policy Gradient with Tree Search (PGTS) in Reinforcement Learning Evades Local Maxima N Kumar, P Agrawal, KY Levy, S Mannor The Second Tiny Papers Track at ICLR 2024 | | 2024 |