When are linear stochastic bandits attackable?

J Zuo, Z Zhang, Z Wang, S Li… - Advances in …, 2023 - proceedings.neurips.cc

Online learning to rank (OLTR) is a sequential decision-making problem where a learning
agent selects an ordered list of items and receives feedback through user clicks. Although …

Cited by 3 · Related articles · All 5 versions

Byzantine-resilient decentralized multi-armed bandits

J Zhu, A Koppel, A Velasquez, J Liu - arXiv preprint arXiv:2310.07320, 2023 - arxiv.org

In decentralized cooperative multi-armed bandits (MAB), each agent observes a distinct
stream of rewards, and seeks to exchange information with others to select a sequence of …

Cited by 4 · Related articles · All 2 versions

Adversarial attacks on combinatorial multi-armed bandits

R Balasubramanian, J Li, P Tadepalli, H Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first
provide a sufficient and necessary condition for the attackability of CMAB, which depends on …

Cited by 1 · Related articles · All 4 versions

Reward Teaching for Federated Multiarmed Bandits

C Shi, W Xiong, C Shen, J Yang - IEEE Transactions on Signal …, 2023 - ieeexplore.ieee.org

Most of the existing federated multi-armed bandits (FMAB) designs are based on the
presumption that clients will implement the specified design to collaborate with the server. In …

Cited by 1 · Related articles · All 9 versions

Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits

Z Wang, H Wang, H Wang - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Adversarial attacks against stochastic multi-armed bandit (MAB) algorithms have been
extensively studied in the literature. In this work, we focus on reward poisoning attacks and …

Follow-ups also matter: improving contextual bandits via post-serving contexts

C Wang, Z Ye, Z Feng… - Advances in …, 2024 - proceedings.neurips.cc

The standard contextual bandit problem assumes that all the relevant contexts are observed
before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short …

Action Poisoning Attacks on Linear Contextual Bandits

G Liu, L Lai - Transactions on Machine Learning Research, 2022 - openreview.net

Contextual bandit algorithms have many applications in a variety of scenarios. In order to
develop trustworthy contextual bandit systems, understanding the impacts of various …

Cited by 2 · Related articles · All 2 versions

Adversarial attacks on adversarial bandits

Y Ma, Z Zhou - arXiv preprint arXiv:2301.12595, 2023 - arxiv.org

We study a security threat to adversarial multi-armed bandits, in which an attacker perturbs
the loss or reward signal to control the behavior of the victim bandit player. We show that the …

Cited by 8 · Related articles · All 4 versions

Teaching Reinforcement Learning Agents via Reinforcement Learning

K Yang, C Shi, C Shen - 2023 57th Annual Conference on …, 2023 - ieeexplore.ieee.org

In many real-world reinforcement learning (RL) tasks, the agent who takes the actions often
only has partial observations of the environment. On the other hand, a principal may have a …

Adversarial Attacks and Robustness of Combinatorial Multi-Armed Bandits

R Balasubramanian - 2024 - ir.library.oregonstate.edu

We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first
provide a sufficient and necessary condition for the attackability of CMAB, a notion to capture …
