Q Shao, J Ye, JCS Lui - Proceedings of the Twenty-fifth International …, 2024 - dl.acm.org
The multi-armed bandit (MAB) problem is an online learning and decision-making model under
uncertainty. Instead of maximizing the expected utility (or reward) as in the classical MAB setting …
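The classical MAB setting mentioned above, where a learner maximizes expected reward by balancing exploration and exploitation, can be illustrated with a minimal sketch. This is not the paper's method; it is a generic epsilon-greedy learner on a hypothetical Bernoulli bandit instance (the arm means and parameters below are illustrative assumptions):

```python
import random

def epsilon_greedy_bandit(true_means, horizon, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy learner on a stochastic MAB instance.

    true_means: per-arm Bernoulli reward probabilities (hypothetical instance).
    Returns total reward collected and the pull count per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k       # number of pulls per arm
    estimates = [0.0] * k  # empirical mean reward per arm
    total = 0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick a uniformly random arm
        else:
            arm = max(range(k), key=lambda a: estimates[a])  # exploit best estimate
        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # incremental update of the empirical mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, counts

total, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8], horizon=5000)
# Over time the learner concentrates most pulls on the highest-mean arm.
```

The snippet's point of departure, per the abstract, is that the paper studies an objective other than this expected-reward maximization.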