Learning contextual bandits in a non-stationary environment

W Lei, G Zhang, X He, Y Miao, X Wang… - Proceedings of the 26th …, 2020 - dl.acm.org

Traditional recommendation systems estimate user preference on items from past interaction
history, thus suffering from the limitations of obtaining fine-grained and dynamic user …

Cited by 255 Related articles All 7 versions

[PDF] arxiv.org

Estimation-action-reflection: Towards deep interaction between conversational and recommender systems

W Lei, X He, Y Miao, Q Wu, R Hong, MY Kan… - Proceedings of the 13th …, 2020 - dl.acm.org

Recommender systems are embracing conversational technologies to obtain user
preferences dynamically, and to overcome inherent limitations of their static models. A …

Cited by 305 Related articles All 12 versions

Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions

N Silva, H Werneck, T Silva, ACM Pereira… - Expert Systems with …, 2022 - Elsevier

Recommender Systems (RSs) have assumed a crucial role in several digital
companies by directly affecting their key performance indicators. Nowadays, in this era of big …

Cited by 74 Related articles All 2 versions

[PDF] neurips.cc

Weighted linear bandits for non-stationary environments

Y Russac, C Vernade, O Cappé - Advances in Neural …, 2019 - proceedings.neurips.cc

We consider a stochastic linear bandit model in which the available actions correspond to
arbitrary context vectors whose associated rewards follow a non-stationary linear regression …

Cited by 142 Related articles All 14 versions

[PDF] ustc.edu.cn

Seamlessly unifying attributes and items: Conversational recommendation for cold-start users

S Li, W Lei, Q Wu, X He, P Jiang, TS Chua - ACM Transactions on …, 2021 - dl.acm.org

Static recommendation methods like collaborative filtering suffer from the inherent limitation
of performing real-time personalization for cold-start users. Online recommendation, e.g. …

Cited by 130 Related articles All 7 versions

[PDF] arxiv.org

"Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin …

X Zhao, L Xia, J Tang, D Yin - ACM sigweb newsletter, 2019 - dl.acm.org

Search, recommendation, and online advertising are the three most important information-
providing mechanisms on the web. These information seeking techniques, satisfying users' …

Cited by 122 Related articles All 8 versions

[PDF] mlr.press

Fair contextual multi-armed bandits: Theory and experiments

Y Chen, A Cuellar, H Luo, J Modi… - … on Uncertainty in …, 2020 - proceedings.mlr.press

When an AI system interacts with multiple users, it frequently needs to make allocation
decisions. For instance, a virtual agent decides whom to pay attention to in a group, or a …

Cited by 71 Related articles All 11 versions

[PDF] neurips.cc

Dynamic causal Bayesian optimization

V Aglietti, N Dhir, J González… - Advances in Neural …, 2021 - proceedings.neurips.cc

We study the problem of performing a sequence of optimal interventions in a dynamic causal
system where both the target variable of interest, and the inputs, evolve over time. This …

Cited by 33 Related articles All 11 versions

[PDF] acm.org

Deep reinforcement learning for information retrieval: Fundamentals and advances

W Zhang, X Zhao, L Zhao, D Yin, GH Yang… - Proceedings of the 43rd …, 2020 - dl.acm.org

Information retrieval (IR) techniques, such as search, recommendation and online
advertising, satisfying users' information needs by suggesting users personalized objects …

Cited by 49 Related articles All 5 versions

[PDF] uva.nl

When people change their mind: Off-policy evaluation in non-stationary recommendation environments

R Jagerman, I Markov, M de Rijke - … conference on web search and data …, 2019 - dl.acm.org

We consider the novel problem of evaluating a recommendation policy offline in
environments where the reward signal is non-stationary. Non-stationarity appears in many …

Cited by 71 Related articles All 6 versions
