Strategic choices: Small budgets and simple regret

CW Chou, PC Chou, CS Lee, DL Saint-Pierre, O Teytaud, MH Wang, LW Wu, SJ Yen
2012 Conference on Technologies and Applications of Artificial …, 2012 - ieeexplore.ieee.org
In many decision problems, there are two levels of choice: the first is strategic and the second is tactical. We formalize the difference between the two, discuss the relevance of the bandit literature for strategic decisions, and test the quality of different bandit algorithms on real-world examples such as board games and card games. As exploration-exploitation algorithms, we evaluate Upper Confidence Bound and Exponential Weights, as well as algorithms designed for simple regret, such as Successive Reject. For the exploitation, we also evaluate Bernstein Races and Uniform Sampling. For the recommendation part, we test Empirically Best Arm, Most Played Arm, Lower Confidence Bound, and Empirical Distribution. In the one-player case, we recommend Upper Confidence Bound as an exploration algorithm (and in particular its variant adaptUCBE for parameter-free simple regret), and Lower Confidence Bound or Most Played Arm as recommendation algorithms. In the two-player case, we point out the convenience and efficiency of the EXP3 algorithm, and the very clear improvement provided by the truncation algorithm TEXP3. Incidentally, our algorithm won some games against professional players in kill-all Go (to the best of our knowledge, a first in computer games).
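The one-player recommendation above (UCB exploration followed by a Most Played Arm recommendation) can be illustrated with a minimal sketch. This is not the paper's implementation: rewards are assumed to lie in [0, 1], the exploration constant `c` is an illustrative choice, and the parameter-free adaptUCBE variant mentioned in the abstract is not reproduced here.

```python
import math
import random

def ucb_explore(pull, n_arms, budget, c=1.0):
    """Spend a fixed budget with UCB1-style exploration, then recommend
    the Most Played Arm (a simple-regret recommendation rule).

    `pull(arm)` is assumed to return a stochastic reward in [0, 1].
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    # Play each arm once to initialize the estimates.
    for a in range(n_arms):
        sums[a] += pull(a)
        counts[a] += 1
    for t in range(n_arms, budget):
        # UCB index: empirical mean plus an exploration bonus that
        # shrinks as an arm is played more often.
        ucb = [sums[a] / counts[a]
               + c * math.sqrt(2.0 * math.log(t + 1) / counts[a])
               for a in range(n_arms)]
        a = max(range(n_arms), key=lambda i: ucb[i])
        sums[a] += pull(a)
        counts[a] += 1
    # Most Played Arm recommendation: the budget is exhausted, so we
    # commit to the arm the exploration phase trusted most.
    return max(range(n_arms), key=lambda a: counts[a])
```

With a reasonable budget, the most played arm and the empirically best arm usually coincide; the Most Played rule is simply more robust to a lucky streak on an under-sampled arm.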
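For the two-player case, the abstract favors EXP3 with a truncation step (TEXP3). The sketch below is a generic EXP3 plus an illustrative truncation of the empirical mixed strategy; the truncation threshold and the parameter `gamma` are assumptions, not values from the paper.

```python
import math
import random

def exp3(pull, n_arms, budget, gamma=0.1):
    """EXP3 for adversarial bandits. Returns the empirical play
    frequencies, used as the recommended mixed strategy."""
    weights = [1.0] * n_arms
    counts = [0] * n_arms
    for _ in range(budget):
        total = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
        a = random.choices(range(n_arms), weights=probs)[0]
        x = pull(a)  # reward assumed to be in [0, 1]
        # Importance-weighted update: only the played arm changes.
        weights[a] *= math.exp(gamma * x / (probs[a] * n_arms))
        counts[a] += 1
    return [c / budget for c in counts]

def texp3(freqs, threshold=0.05):
    """Truncation in the spirit of TEXP3: zero out arms whose empirical
    frequency falls below a threshold, then renormalize. The threshold
    here is an illustrative assumption."""
    kept = [f if f >= threshold else 0.0 for f in freqs]
    total = sum(kept)
    return [f / total for f in kept]
```

The intuition behind truncation is that with a small budget, EXP3's empirical frequencies put small spurious mass on arms that belong outside the support of the equilibrium strategy; removing that mass sharpens the recommended mixture.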