Cascading linear submodular bandits: Accounting for position bias and diversity in online...

Bandit algorithms: A comprehensive review and their dynamic selection from a portfolio for multicriteria top-k recommendation

A Letard, N Gutowski, O Camp, T Amghar - Expert Systems with …, 2024 - Elsevier

This paper discusses the use of portfolio approaches based on bandit algorithms to optimize
multicriteria decision-making in recommender systems (accuracy and diversity). While …

Cited by 5 · Related articles · All 2 versions

[PDF] mlr.press

A framework for adapting offline algorithms to solve combinatorial multi-armed bandit problems with bandit feedback

G Nie, YY Nadew, Y Zhu… - … on Machine Learning, 2023 - proceedings.mlr.press

We investigate the problem of stochastic, combinatorial multi-armed bandits where the
learner only has access to bandit feedback and the reward function can be non-linear. We …

Cited by 14 · Related articles · All 9 versions

[PDF] acm.org

Mitigating exposure bias in online learning to rank recommendation: A novel reward model for cascading bandits

M Mansoury, B Mobasher, H van Hoof - Proceedings of the 33rd ACM …, 2024 - dl.acm.org

Exposure bias is a well-known issue in recommender systems where items and suppliers
are not equally represented in the recommendation results. This bias becomes particularly …

Cited by 3 · Related articles · All 5 versions

[PDF] neurips.cc

Minimax regret for cascading bandits

D Vial, S Sanghavi, S Shakkottai… - Advances in Neural …, 2022 - proceedings.neurips.cc

Cascading bandits is a natural and popular model that frames the task of learning to rank
from Bernoulli click feedback in a bandit setting. For the case of unstructured rewards, we …

Cited by 16 · Related articles · All 7 versions

[PDF] aaai.org

A hybrid bandit framework for diversified recommendation

Q Ding, Y Liu, C Miao, F Cheng, H Tang - Proceedings of the AAAI …, 2021 - ojs.aaai.org

The interactive recommender systems involve users in the recommendation procedure by
receiving timely user feedback to update the recommendation policy. Therefore, they are …

Cited by 29 · Related articles · All 6 versions

[PDF] arxiv.org

Cascading hybrid bandits: Online learning to rank for relevance and diversity

C Li, H Feng, M Rijke - Proceedings of the 14th ACM Conference on …, 2020 - dl.acm.org

Relevance ranking and result diversification are two core areas in modern recommender
systems. Relevance ranking aims at building a ranked list sorted in decreasing order of item …

Cited by 38 · Related articles · All 6 versions

[PDF] mlr.press

On the value of prior in online learning to rank

B Kveton, O Meshi, M Zoghi… - … Conference on Artificial …, 2022 - proceedings.mlr.press

This paper addresses the cold-start problem in online learning to rank (OLTR). We show
both theoretically and empirically that priors improve the quality of ranked lists presented to …

Cited by 12 · Related articles · All 4 versions

[PDF] mlr.press

Submodular bandit problem under multiple constraints

S Takemori, M Sato, T Sonoda… - … on Uncertainty in …, 2020 - proceedings.mlr.press

The linear submodular bandit problem was proposed to simultaneously address diversified
retrieval and online learning in a recommender system. If there is no uncertainty, this …

Cited by 18 · Related articles · All 9 versions

[PDF] nsf.gov

Learning to make decisions via submodular regularization

A Alieva, A Aceves, J Song, S Mayo, Y Yue… - … Conference on Learning …, 2020 - par.nsf.gov

Many sequential decision making tasks can be viewed as combinatorial optimization
problems over a large number of actions. When the cost of evaluating an action is high …

Cited by 14 · Related articles · All 6 versions

[PDF] aaai.org

Context uncertainty in contextual bandits with applications to recommender systems

H Wang, Y Ma, H Ding, Y Wang - … of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Recurrent neural networks have proven effective in modeling sequential user feedbacks for
recommender systems. However, they usually focus solely on item relevance and fail to …

Cited by 8 · Related articles · All 9 versions
