L Yang, M Wang - International Conference on Machine …, 2020 - proceedings.mlr.press
Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the state-action space is large. A common practice is to parameterize the high-dimensional …
J He, D Zhou, Q Gu - International Conference on Machine …, 2021 - proceedings.mlr.press
Reinforcement learning (RL) with linear function approximation has received increasing attention recently. However, existing work has focused on obtaining $\sqrt{T}$-type regret …
D Zhou, J He, Q Gu - International Conference on Machine …, 2021 - proceedings.mlr.press
Modern tasks in reinforcement learning have large state and action spaces. To deal with them efficiently, one often uses predefined feature mapping to represent states and actions …
Policy gradient methods are among the most effective methods for large-scale reinforcement learning, and their empirical success has prompted several works that develop the …
P Auer, P Gajane, R Ortner - Conference on Learning Theory, 2019 - proceedings.mlr.press
We consider the variant of the stochastic multi-armed bandit problem where the stochastic reward distributions may change abruptly several times. In contrast to previous work, we are …
J Zimmert, Y Seldin - The 22nd International Conference on …, 2019 - proceedings.mlr.press
We derive an algorithm that achieves the optimal (up to constants) pseudo-regret in both adversarial and stochastic multi-armed bandits without prior knowledge of the regime and …
J Zimmert, Y Seldin - Journal of Machine Learning Research, 2021 - jmlr.org
We derive an algorithm that achieves the optimal (within constants) pseudo-regret in both adversarial and stochastic multi-armed bandits without prior knowledge of the regime and …
Motivated by the emerging need for learning algorithms for large-scale networked and decentralized systems, we introduce a distributed version of the classical stochastic Multi …
We study the problem of constrained efficient global optimization, where both the objective and constraints are expensive black-box functions that can be learned with Gaussian …