Multi-agent multi-armed bandit learning for online management of edge-assisted computing

B Wu, T Chen, W Ni, X Wang - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
By orchestrating resources of edge and core network, the delays of edge-assisted
computing can decrease. Offloading scheduling is challenging though, especially in the …

Practical contextual bandits with feedback graphs

M Zhang, Y Zhang, O Vrousgou… - Advances in Neural …, 2024 - proceedings.neurips.cc
While contextual bandit has a mature theory, effectively leveraging different feedback
patterns to enhance the pace of learning remains unclear. Bandits with feedback graphs …

Asymptotically-optimal Gaussian bandits with side observations

A Atsidakou, O Papadigenopoulos… - International …, 2022 - proceedings.mlr.press
We study the problem of Gaussian bandits with general side information, as first introduced
by Wu, Szepesvári, and György. In this setting, the play of an arm reveals information about …

Efficient contextual bandits with uninformed feedback graphs

M Zhang, Y Zhang, H Luo, P Mineiro - arXiv preprint arXiv:2402.08127, 2024 - arxiv.org
Bandits with feedback graphs are powerful online learning models that interpolate between
the full information and classic bandit problems, capturing many real-life applications. A …

Efficient algorithms for multi-armed bandits with additional feedbacks: Modeling and algorithms

H Xie, H Gu, Z Qi - Information Sciences, 2023 - Elsevier
Multi-armed bandits (MAB) are widely applied to optimize networking applications such as
crowdsensing and mobile edge computing. Additional feedbacks (or partial feedbacks) on …

Stochastic Graph Bandit Learning with Side-Observations

X Gong, J Zhang - arXiv preprint arXiv:2308.15107, 2023 - arxiv.org
In this paper, we investigate the stochastic contextual bandit with general function space and
graph feedback. We propose an algorithm that addresses this problem by adapting to both …

Near-optimal algorithms for piecewise-stationary cascading bandits

L Wang, H Zhou, B Li, LR Varshney… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Cascading bandit (CB) is a popular model for web search and online advertising. However,
the stationary CB model may be too simple to cope with real-world problems, where user …

SEQUENTIAL PREDICTION AND DECISION MAKING, JOINT COMMUNITY DETECTION AND PHASE SYNCHRONIZATION, AND ACCELERATION …

L WANG - 2024 - lingdawang.github.io
In this dissertation, several topics in machine learning are presented, including sequential
prediction and decision making (e.g., time series/spatiotemporal sequences prediction and …

Improved Regret Bounds in Stochastic Contextual Bandits with Graph Feedback

X Gong, J Zhang - openreview.net
This paper investigates the stochastic contextual bandit problem with general function space
and graph feedback. We propose a novel algorithm that effectively adapts to the time …