- Academic resource search

[Book][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3145 - Related articles - All 9 versions

[PDF] arxiv.org

"Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin …

X Zhao, L Xia, J Tang, D Yin - ACM SIGWEB Newsletter, 2019 - dl.acm.org

Search, recommendation, and online advertising are the three most important information-providing
mechanisms on the web. These information seeking techniques, satisfying users' …

Cited by 118 - Related articles - All 8 versions

[PDF] mlr.press

Hierarchical bayesian bandits

J Hong, B Kveton, M Zaheer… - International …, 2022 - proceedings.mlr.press

Abstract Meta-, multi-task, and federated learning can be all viewed as solving similar tasks,
drawn from a distribution that reflects task similarities. We provide a unified view of all these …

Cited by 45 - Related articles - All 4 versions

[PDF] arxiv.org

Cascading bandits for large-scale recommendation problems

S Zong, H Ni, K Sung, NR Ke, Z Wen… - arXiv preprint arXiv …, 2016 - arxiv.org

Most recommender systems recommend a list of items. The user examines the list, from the
first item to the last, and often chooses the first attractive item and does not examine the rest …

Cited by 131 - Related articles - All 11 versions

[PDF] arxiv.org

Carousel personalization in music streaming apps with contextual bandits

W Bendada, G Salha, T Bontempelli - … of the 14th ACM Conference on …, 2020 - dl.acm.org

Media services providers, such as music streaming platforms, frequently leverage swipeable
carousels to recommend personalized content to their users. However, selecting the most …

Cited by 59 - Related articles - All 5 versions

[PDF] jonathanstray.com

A visual dialog augmented interactive recommender system

T Yu, Y Shen, H Jin - Proceedings of the 25th ACM SIGKDD international …, 2019 - dl.acm.org

Traditional recommender systems rely on user feedback such as ratings or clicks to the
items, to analyze the user interest and provide personalized recommendations. However …

Cited by 74 - Related articles - All 2 versions

[PDF] arxiv.org

Unbiased learning to rank: online or offline?

Q Ai, T Yang, H Wang, J Mao - ACM Transactions on Information …, 2021 - dl.acm.org

How to obtain an unbiased ranking model by learning to rank with biased user feedback is
an important research question for IR. Existing work on unbiased learning to rank (ULTR) …

Cited by 66 - Related articles - All 3 versions

[PDF] mlr.press

Online learning to rank in stochastic click models

M Zoghi, T Tunys, M Ghavamzadeh… - International …, 2017 - proceedings.mlr.press

Online learning to rank is a core problem in information retrieval and machine learning.
Many provably efficient algorithms have been recently proposed for this problem in specific …

Cited by 113 - Related articles - All 11 versions

[PDF] neurips.cc

Multiple-play bandits in the position-based model

P Lagrée, C Vernade, O Cappé - Advances in Neural …, 2016 - proceedings.neurips.cc

Sequentially learning to place items in multi-position displays or lists is a task that can be
cast into the multiple-play semi-bandit setting. However, a major concern in this context is …

Cited by 103 - Related articles - All 11 versions

[PDF] mlr.press

Stochastic bandits with delay-dependent payoffs

L Cella, N Cesa-Bianchi - International Conference on …, 2020 - proceedings.mlr.press

Motivated by recommendation problems in music streaming platforms, we propose a
nonstationary stochastic bandit model in which the expected reward of an arm depends on …

Cited by 53 - Related articles - All 10 versions
