IMO: Interactive Multi-Objective Off-Policy Optimization

N Wang, H Wang, M Karimzadehgan, B Kveton… - arXiv preprint arXiv …, 2022 - arxiv.org
Most real-world optimization problems have multiple objectives. A system designer needs to
find a policy that trades off these objectives to reach a desired operating point. This problem …

Guaranteed Fixed-Confidence Best Arm Identification in Multi-Armed Bandits: Simple Sequential Elimination Algorithms

MJ Azizi, SM Ross, Z Zhang - arXiv preprint arXiv:2106.06848, 2021 - arxiv.org
We consider the problem of finding, through adaptive sampling, which of $n$ options
(arms) has the largest mean. Our objective is to determine a rule which identifies the best …