Kullback-Leibler upper confidence bounds for optimal sequential allocation O Cappé, A Garivier, OA Maillard, R Munos, G Stoltz The Annals of Statistics, 1516-1541, 2013 | 428 | 2013 |
Concentration inequalities for sampling without replacement R Bardenet, OA Maillard | 198 | 2015 |
CATS, a low pressure multiwire proportionnal chamber for secondary beam tracking at GANIL S Ottini-Hustache, C Mazur, F Auger, A Musumarra, N Alamanos, ... Nuclear Instruments and Methods in Physics Research Section A: Accelerators …, 1999 | 167 | 1999 |
A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences OA Maillard, R Munos, G Stoltz Proceedings of the 24th annual Conference On Learning Theory, 497-514, 2011 | 161 | 2011 |
Compressed least-squares regression OA Maillard, R Munos Advances in Neural Information Processing Systems, 2009 | 135 | 2009 |
Latent Bandits OA Maillard, S Mannor International Conference on Machine Learning, 136-144, 2014 | 106 | 2014 |
The non-stationary stochastic multi-armed bandit problem R Allesiardo, R Féraud, OA Maillard International Journal of Data Science and Analytics 3, 267-283, 2017 | 89 | 2017 |
Robust risk-averse stochastic multi-armed bandits OA Maillard Algorithmic Learning Theory: 24th International Conference, ALT 2013 …, 2013 | 77 | 2013 |
Variance-aware regret bounds for undiscounted reinforcement learning in MDPs MS Talebi, OA Maillard Algorithmic Learning Theory, 770-805, 2018 | 74 | 2018 |
LSTD with random projections M Ghavamzadeh, A Lazaric, OA Maillard, R Munos Advances in Neural Information Processing Systems 23, 721-729, 2010 | 74 | 2010 |
Sub-sampling for multi-armed bandits A Baransi, OA Maillard, S Mannor Machine Learning and Knowledge Discovery in Databases: European Conference …, 2014 | 64 | 2014 |
PICOSEC: Charged particle timing at sub-25 picosecond precision with a Micromegas based detector J Bortfeldt, F Brunbauer, C David, D Desforge, G Fanourakis, J Franchi, ... Nuclear Instruments and Methods in Physics Research Section A: Accelerators …, 2018 | 62 | 2018 |
"How hard is my MDP?" The distribution-norm to the rescue OA Maillard, TA Mann, S Mannor Advances in Neural Information Processing Systems 27, 2014 | 61 | 2014 |
Linear regression with random projections O Maillard, R Munos Journal of Machine Learning Research 13 (1), 2735-2772, 2012 | 61 | 2012 |
Online learning in adversarial Lipschitz environments OA Maillard, R Munos Joint European Conference on Machine Learning and Knowledge Discovery in …, 2010 | 54 | 2010 |
Finite-sample analysis of Bellman residual minimization OA Maillard, R Munos, A Lazaric, M Ghavamzadeh Proceedings of 2nd Asian Conference on Machine Learning, 299-314, 2010 | 48 | 2010 |
Selecting the state-representation in reinforcement learning OA Maillard, D Ryabko, R Munos Advances in Neural Information Processing Systems 24, 2011 | 47 | 2011 |
Optimal Thompson sampling strategies for support-aware CVaR bandits D Baudry, R Gautron, E Kaufmann, O Maillard International Conference on Machine Learning, 716-726, 2021 | 40 | 2021 |
Adaptive Bandits: Towards the best history-dependent strategy OA Maillard, R Munos Proceedings of the Fourteenth International Conference on Artificial …, 2011 | 40* | 2011 |
Tightening exploration in upper confidence reinforcement learning H Bourel, O Maillard, MS Talebi International Conference on Machine Learning, 1056-1066, 2020 | 36 | 2020 |