Reinforcement learning for dynamic dimensioning of cloud caches: A restless bandit approach

Finite-time analysis of Whittle index based Q-learning for restless multi-armed bandits with neural network function approximation

G Xiong, J Li - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc

Whittle index policy is a heuristic to the intractable restless multi-armed bandits (RMAB)
problem. Although it is provably asymptotically optimal, finding Whittle indices remains …

Cited by 14 · Related articles · All 8 versions

Learning infinite-horizon average-reward restless multi-action bandits via index awareness

G Xiong, S Wang, J Li - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We consider the online restless bandits with average-reward and multiple actions, where the
state of each arm evolves according to a Markov decision process (MDP), and the reward of …

Cited by 14 · Related articles · All 5 versions

Index-aware reinforcement learning for adaptive video streaming at the wireless edge

G Xiong, X Qin, B Li, R Singh, J Li - Proceedings of the Twenty-Third …, 2022 - dl.acm.org

We study adaptive video streaming for multiple users in wireless access edge networks with
unreliable channels. The key challenge is to jointly optimize the video bitrate adaptation and …

Cited by 20 · Related articles · All 4 versions

Prioritized information bottleneck theoretic framework with distributed online learning for edge video analytics

Z Fang, S Hu, J Wang, Y Deng… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org

Collaborative perception systems leverage multiple edge devices, such as surveillance
cameras or autonomous cars, to enhance sensing quality and eliminate blind spots. Despite …

Cited by 3 · Related articles · All 3 versions

Online restless multi-armed bandits with long-term fairness constraints

S Wang, G Xiong, J Li - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Restless multi-armed bandits (RMAB) have been widely used to model sequential decision
making problems with constraints. The decision maker (DM) aims to maximize the expected …

Cited by 4 · Related articles · All 4 versions

Towards Foundation-model-based Multiagent System to Accelerate AI for Social Impact

Y Zhao, N Boehmer, A Taneja, M Tambe - arXiv preprint arXiv:2412.07880, 2024 - arxiv.org

AI for social impact (AI4SI) offers significant potential for addressing complex societal
challenges in areas such as public health, agriculture, education, conservation, and public …

Cited by 1 · Related articles · All 2 versions

DOPL: Direct online preference learning for restless bandits with preference feedback

G Xiong, U Dinesha, D Mukherjee, J Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Restless multi-armed bandits (RMAB) has been widely used to model constrained
sequential decision making problems, where the state of each restless arm evolves …

Cited by 1 · Related articles · All 2 versions

Crowd²: Multi-agent Bandit-based Dispatch for Video Analytics upon Crowdsourcing

Y Chen, S Zhang, Y Yan, Y Jin, N Chen… - … -IEEE Conference on …, 2023 - ieeexplore.ieee.org

Many crowdsourcing platforms are emerging, leveraging the resources of recruited workers
to execute various outsourcing tasks, mainly for those computing-intensive video analytics …

Cited by 2 · Related articles · All 2 versions

Congestion-aware routing and content placement in elastic cache networks

J Zhang, E Yeh - IEEE INFOCOM 2024-IEEE Conference on …, 2024 - ieeexplore.ieee.org

Caching can be leveraged to significantly improve network performance and mitigate
congestion. However, characterizing the optimal tradeoff between routing cost and cache …

Cited by 4 · Related articles · All 2 versions

Whittle index-based Q-learning for wireless edge caching with linear function approximation

G Xiong, S Wang, J Li, R Singh - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

We consider the problem of content caching at the wireless edge to serve a set of end users
via unreliable wireless channels so as to minimize the average latency experienced by end …

Cited by 6 · Related articles · All 2 versions
