Strategic choices: Small budgets and simple regret

CW Chou, PC Chou, CS Lee, DL Saint-Pierre, O Teytaud, MH Wang, LW Wu, SJ Yen
2012 Conference on Technologies and Applications of Artificial …, 2012 - ieeexplore.ieee.org
In many decision problems, there are two levels of choice: the first is strategic and the second is tactical. We formalize the difference between the two, discuss the relevance of the bandit literature for strategic decisions, and test the quality of different bandit algorithms on real-world examples such as board games and card games. As exploration-exploitation algorithms, we evaluate Upper Confidence Bound and Exponential Weights, as well as algorithms designed for simple regret, such as Successive Reject. For the exploitation, we also evaluate Bernstein Races and Uniform Sampling. For the recommendation part, we test Empirically Best Arm, Most Played Arm, Lower Confidence Bound, and Empirical Distribution. In the one-player case, we recommend Upper Confidence Bound as an exploration algorithm (and in particular its variant adaptUCBE for parameter-free simple regret), and Lower Confidence Bound or Most Played Arm as recommendation algorithms. In the two-player case, we point out the convenience and efficiency of the EXP3 algorithm, and the very clear improvement provided by the truncation algorithm TEXP3. Incidentally, our algorithm won some games against professional players in kill-all Go (to the best of our knowledge, a first in computer games).
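The one-player recommendation above (UCB exploration followed by a Most Played Arm recommendation) can be illustrated with a minimal sketch. This is not the paper's implementation: rewards are assumed to lie in [0, 1], the exploration constant `c` is an illustrative choice, and the parameter-free adaptUCBE variant mentioned in the abstract is not reproduced here.

```python
import math
import random

def ucb_explore(pull, n_arms, budget, c=1.0):
    """Spend a fixed budget with UCB1-style exploration, then recommend
    the Most Played Arm (a simple-regret recommendation rule).

    `pull(arm)` is assumed to return a stochastic reward in [0, 1].
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    # Play each arm once to initialize the estimates.
    for a in range(n_arms):
        sums[a] += pull(a)
        counts[a] += 1
    for t in range(n_arms, budget):
        # UCB index: empirical mean plus an exploration bonus that
        # shrinks as an arm is played more often.
        ucb = [sums[a] / counts[a]
               + c * math.sqrt(2.0 * math.log(t + 1) / counts[a])
               for a in range(n_arms)]
        a = max(range(n_arms), key=lambda i: ucb[i])
        sums[a] += pull(a)
        counts[a] += 1
    # Most Played Arm recommendation: the budget is exhausted, so we
    # commit to the arm the exploration phase trusted most.
    return max(range(n_arms), key=lambda a: counts[a])
```

With a reasonable budget, the most played arm and the empirically best arm usually coincide; the Most Played rule is simply more robust to a lucky streak on an under-sampled arm.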
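For the two-player case, the abstract favors EXP3 with a truncation step (TEXP3). The sketch below is a generic EXP3 plus an illustrative truncation of the empirical mixed strategy; the truncation threshold and the parameter `gamma` are assumptions, not values from the paper.

```python
import math
import random

def exp3(pull, n_arms, budget, gamma=0.1):
    """EXP3 for adversarial bandits. Returns the empirical play
    frequencies, used as the recommended mixed strategy."""
    weights = [1.0] * n_arms
    counts = [0] * n_arms
    for _ in range(budget):
        total = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
        a = random.choices(range(n_arms), weights=probs)[0]
        x = pull(a)  # reward assumed to be in [0, 1]
        # Importance-weighted update: only the played arm changes.
        weights[a] *= math.exp(gamma * x / (probs[a] * n_arms))
        counts[a] += 1
    return [c / budget for c in counts]

def texp3(freqs, threshold=0.05):
    """Truncation in the spirit of TEXP3: zero out arms whose empirical
    frequency falls below a threshold, then renormalize. The threshold
    here is an illustrative assumption."""
    kept = [f if f >= threshold else 0.0 for f in freqs]
    total = sum(kept)
    return [f / total for f in kept]
```

The intuition behind truncation is that with a small budget, EXP3's empirical frequencies put small spurious mass on arms that belong outside the support of the equilibrium strategy; removing that mass sharpens the recommended mixture.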