Apprenticeship learning using inverse reinforcement learning and gradient methods G Neu, C Szepesvári Proc. UAI, 295-302, 2007 | 319 | 2007 |
A unified view of entropy-regularized Markov decision processes G Neu, A Jonsson, V Gómez arXiv preprint arXiv:1705.07798, 2017 | 269 | 2017 |
Boltzmann Exploration Done Right N Cesa-Bianchi, C Gentile, G Lugosi, G Neu Neural Information Processing Systems (NIPS), 6287-6296, 2017 | 213 | 2017 |
Online Markov decision processes under bandit feedback G Neu, A Antos, A György, C Szepesvári Advances in Neural Information Processing Systems 23, 2010 | 210 | 2010 |
Explore no more: Improved high-probability regret bounds for non-stochastic bandits G Neu Neural Information Processing Systems (NIPS), 2015 | 174 | 2015 |
Online Learning in Episodic Markovian Decision Processes by Relative Entropy Policy Search A Zimin, G Neu Neural Information Processing Systems (NIPS), 2013 | 143 | 2013 |
Efficient learning by implicit exploration in bandit problems with side observations T Kocák, G Neu, M Valko, R Munos Neural Information Processing Systems (NIPS), 2014 | 130 | 2014 |
Training parsers by inverse reinforcement learning G Neu, C Szepesvári Machine learning 77 (2), 303-337, 2009 | 94 | 2009 |
Algorithmic stability and hypothesis complexity T Liu, G Lugosi, G Neu, D Tao Proceedings of the 34th International Conference on Machine Learning, 2159-2167, 2017 | 93 | 2017 |
The adversarial stochastic shortest path problem with unknown transition probabilities G Neu, A György, C Szepesvári AI & Statistics, 2012 | 93 | 2012 |
An efficient algorithm for learning with semi-bandit feedback G Neu, G Bartók Algorithmic Learning Theory (ALT 2013), 2013 | 91 | 2013 |
The online loop-free stochastic shortest-path problem G Neu, A György, C Szepesvári The 23rd Annual Conference on Learning Theory (COLT 2010), 2010 | 80 | 2010 |
Information-Theoretic Generalization Bounds for Stochastic Gradient Descent G Neu, GK Dziugaite, M Haghifam, DM Roy The 34th Annual Conference on Learning Theory (COLT 2021), 3526-3545, 2021 | 76 | 2021 |
A unifying view of optimism in episodic reinforcement learning G Neu, C Pike-Burke Advances in Neural Information Processing Systems 33, 2020 | 70 | 2020 |
Iterate averaging as regularization for stochastic gradient descent G Neu, L Rosasco The 31st Annual Conference on Learning Theory (COLT 2018), 3222-3242, 2018 | 67 | 2018 |
Collaborative spatial reuse in wireless networks via selfish multi-armed bandits F Wilhelmi, C Cano, G Neu, B Bellalta, A Jonsson, S Barrachina-Muñoz Ad Hoc Networks 88, 129-141, 2019 | 62 | 2019 |
Exploiting easy data in online optimization A Sani, G Neu, A Lazaric Neural Information Processing Systems (NIPS), 2014 | 60 | 2014 |
Potential and Pitfalls of Multi-Armed Bandits for Decentralized Spatial Reuse in WLANs F Wilhelmi, S Barrachina-Muñoz, B Bellalta, C Cano, A Jonsson, G Neu Journal of Network and Computer Applications 127, 26-42, 2019 | 58 | 2019 |
First-order regret bounds for combinatorial semi-bandits G Neu The 28th Annual Conference on Learning Theory (COLT 2015), 1360-1375, 2015 | 56 | 2015 |
Logistic Q-Learning J Bas-Serrano, S Curi, A Krause, G Neu International Conference on Artificial Intelligence and Statistics, 3610-3618, 2021 | 50 | 2021 |