Safe reinforcement learning is extremely challenging: not only must the agent explore an unknown environment, it must do so while ensuring no safety constraint violations. We …
X Yi, X Li, T Yang, L Xie, T Chai… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Distributed bandit online convex optimization with time-varying coupled inequality constraints is considered, motivated by a repeated game between a group of learners and …
This paper considers online convex optimization with hard constraints and analyzes achievable regret and cumulative hard constraint violation (violation for short). The problem …
Achieving low latency is paramount for live streaming scenarios, which are nowadays becoming increasingly popular. In this paper, we propose a novel algorithm for bitrate …
X Wei, H Yu, MJ Neely - Proceedings of the ACM on Measurement and …, 2020 - dl.acm.org
We consider online convex optimization with stochastic constraints where the objective functions are arbitrarily time-varying and the constraint functions are independent and …
We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget …
D Yuan, A Proutiere, G Shi - IEEE Transactions on Automatic …, 2021 - ieeexplore.ieee.org
In this article, we consider distributed online convex optimization problems, where the distributed system consists of various computing units connected through a time-varying …
Caching refers to the act of replicating information at a faster (or closer) medium with the purpose of improving performance. This deceptively simple idea has given rise to some of …
JCN Liang, H Lu, B Zhou - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Today's online advertisers procure digital ad impressions through interacting with autobidding platforms: advertisers convey high level procurement goals via setting levers …