Learning algorithms for minimizing queue length regret

D Freund, T Lykouris, W Weng - Advances in Neural …, 2024 - proceedings.neurips.cc

Queueing systems are widely applicable stochastic models with use cases in
communication networks, healthcare, service systems, etc. Although their optimal control …

Cited by 3 Related articles All 5 versions

Efficient decentralized multi-agent learning in asymmetric queuing systems

D Freund, T Lykouris, W Weng - Conference on Learning …, 2022 - proceedings.mlr.press

We study decentralized multi-agent learning in bipartite queuing systems, a standard model
for service systems. In particular, N agents request service from K servers in a fully …

Cited by 18 Related articles

Learning to schedule tasks with deadline and throughput constraints

Q Liu, Z Fang - IEEE INFOCOM 2023-IEEE Conference on …, 2023 - ieeexplore.ieee.org

We consider the task scheduling scenario where the controller activates one from K task
types at each time. Each task induces a random completion time, and a reward is obtained …

Cited by 18 Related articles All 2 versions

Job dispatching policies for queueing systems with unknown service rates

T Choudhury, G Joshi, W Wang… - Proceedings of the Twenty …, 2021 - dl.acm.org

In multi-server queueing systems where there is no central queue holding all incoming jobs,
job dispatching policies are used to assign incoming jobs to the queue at one of the servers …

Cited by 38 Related articles All 5 versions

Decentralized scheduling with QoS constraints: Achieving O(1) QoS regret of multi-player bandits

Q Liu, Z Fang - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

We consider a decentralized multi-player multi-armed bandit (MP-MAB) problem where
players cannot observe the actions and rewards of other players and no explicit …

Cited by 3 Related articles

Learning to defer in content moderation: The human-AI interplay

T Lykouris, W Weng - arXiv preprint arXiv:2402.12237, 2024 - arxiv.org

Successful content moderation in online platforms relies on a human-AI collaboration
approach. A typical heuristic estimates the expected harmfulness of a post and uses fixed …

Cited by 6 Related articles All 2 versions

Bayesian learning of optimal policies in Markov decision processes with countably infinite state-space

S Adler, V Subramanian - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Models of many real-life applications, such as queueing models of communication
networks or computing systems, have a countably infinite state-space. Algorithmic and …

Cited by 5 Related articles All 5 versions

Gamekeeper: Online learning for admission control of networked open multiagent systems

I Bistritz, N Bambos - IEEE Transactions on Automatic Control, 2024 - ieeexplore.ieee.org

We consider open games where players arrive according to a Poisson process with rate and
stay in the game for an exponential random duration with rate. The game evolves in …

Cited by 2 Related articles

Learning-based optimal admission control in a single-server queuing system

A Cohen, V Subramanian, Y Zhang - Stochastic Systems, 2024 - pubsonline.informs.org

We consider a long-term average profit–maximizing admission control problem in an M/M/1
queuing system with unknown service and arrival rates. With a fixed reward collected upon …

Cited by 7 Related articles All 4 versions

Online task scheduling and termination with throughput constraint

Q Liu, Z Fang - IEEE/ACM Transactions on Networking, 2024 - ieeexplore.ieee.org

We consider the task scheduling scenario where the controller activates one from task types
at each time. Each task induces a random completion time, and a reward is obtained only …

Cited by 1 Related articles All 4 versions
