[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

An efficiency-boosting client selection scheme for federated learning with fairness guarantee

T Huang, W Lin, W Wu, L He, K Li… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
The issue of potential privacy leakage during centralized AI's model training has drawn
intensive concern from the public. A Parallel and Distributed Computing (or PDC) scheme …

Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions

N Silva, H Werneck, T Silva, ACM Pereira… - Expert Systems with …, 2022 - Elsevier
Recommender Systems (RSs) have assumed a crucial role in several digital
companies by directly affecting their key performance indicators. Nowadays, in this era of big …

A review of client selection methods in federated learning

S Mayhoub, TM Shami - Archives of Computational Methods in …, 2024 - Springer
Federated learning (FL) is a promising new technology that allows machine learning (ML)
models to be trained locally on edge devices while preserving the privacy of the devices' …

Reinforcement learning to rank in e-commerce search engine: Formalization, analysis, and application

Y Hu, Q Da, A Zeng, Y Yu, Y Xu - Proceedings of the 24th ACM SIGKDD …, 2018 - dl.acm.org
In E-commerce platforms such as Amazon and TaoBao, ranking items in a search session is
a typical multi-step decision-making problem. Learning to rank (LTR) methods have been …

Spatio–temporal edge service placement: A bandit learning approach

L Chen, J Xu, S Ren, P Zhou - IEEE Transactions on Wireless …, 2018 - ieeexplore.ieee.org
Shared edge computing platforms deployed at the radio access network are expected to
significantly improve the quality-of-service delivered by application service providers (ASPs) …

Online learning to rank in stochastic click models

M Zoghi, T Tunys, M Ghavamzadeh… - International …, 2017 - proceedings.mlr.press
Online learning to rank is a core problem in information retrieval and machine learning.
Many provably efficient algorithms have been recently proposed for this problem in specific …

Contextual combinatorial bandits with probabilistically triggered arms

X Liu, J Zuo, S Wang, JCS Lui… - International …, 2023 - proceedings.mlr.press
We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T)
under a variety of smoothness conditions that capture a wide range of applications …

Residential HVAC aggregation based on risk-averse multi-armed bandit learning for secondary frequency regulation

X Chen, Q Hu, Q Shi, X Quan, Z Wu… - Journal of Modern Power …, 2020 - ieeexplore.ieee.org
As the penetration of renewable energy continues to increase, stochastic and intermittent
generation resources gradually replace the conventional generators, bringing significant …

Batch-size independent regret bounds for combinatorial semi-bandits with probabilistically triggered arms or independent arms

X Liu, J Zuo, S Wang, C Joe-Wong… - Advances in Neural …, 2022 - proceedings.neurips.cc
In this paper, we study the combinatorial semi-bandits (CMAB) and focus on reducing the
dependency of the batch-size $K$ in the regret bound, where $K$ is the total number of …