Adversarial attacks on online learning to rank with click feedback

J Zuo, Z Zhang, Z Wang, S Li… - Advances in …, 2023 - proceedings.neurips.cc
Online learning to rank (OLTR) is a sequential decision-making problem where a learning
agent selects an ordered list of items and receives feedback through user clicks. Although …

Byzantine-resilient decentralized multi-armed bandits

J Zhu, A Koppel, A Velasquez, J Liu - arXiv preprint arXiv:2310.07320, 2023 - arxiv.org
In decentralized cooperative multi-armed bandits (MAB), each agent observes a distinct
stream of rewards, and seeks to exchange information with others to select a sequence of …

Adversarial attacks on combinatorial multi-armed bandits

R Balasubramanian, J Li, P Tadepalli, H Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first
provide a sufficient and necessary condition for the attackability of CMAB, which depends on …

Reward Teaching for Federated Multiarmed Bandits

C Shi, W Xiong, C Shen, J Yang - IEEE Transactions on Signal …, 2023 - ieeexplore.ieee.org
Most of the existing federated multi-armed bandits (FMAB) designs are based on the
presumption that clients will implement the specified design to collaborate with the server. In …

Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits

Z Wang, H Wang, H Wang - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Adversarial attacks against stochastic multi-armed bandit (MAB) algorithms have been
extensively studied in the literature. In this work, we focus on reward poisoning attacks and …

Follow-ups also matter: improving contextual bandits via post-serving contexts

C Wang, Z Ye, Z Feng… - Advances in …, 2024 - proceedings.neurips.cc
The standard contextual bandit problem assumes that all the relevant contexts are observed
before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short …

Action Poisoning Attacks on Linear Contextual Bandits

G Liu, L Lai - Transactions on Machine Learning Research, 2022 - openreview.net
Contextual bandit algorithms have many applications in a variety of scenarios. In order to
develop trustworthy contextual bandit systems, understanding the impacts of various …

Adversarial attacks on adversarial bandits

Y Ma, Z Zhou - arXiv preprint arXiv:2301.12595, 2023 - arxiv.org
We study a security threat to adversarial multi-armed bandits, in which an attacker perturbs
the loss or reward signal to control the behavior of the victim bandit player. We show that the …

Teaching Reinforcement Learning Agents via Reinforcement Learning

K Yang, C Shi, C Shen - 2023 57th Annual Conference on …, 2023 - ieeexplore.ieee.org
In many real-world reinforcement learning (RL) tasks, the agent who takes the actions often
only has partial observations of the environment. On the other hand, a principal may have a …

Adversarial Attacks and Robustness of Combinatorial Multi-Armed Bandits

R Balasubramanian - 2024 - ir.library.oregonstate.edu
We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first
provide a sufficient and necessary condition for the attackability of CMAB, a notion to capture …