Instance-dependent near-optimal policy identification in linear MDPs via online experiment design

A Wagenmaker, KG Jamieson - Advances in Neural …, 2022 - proceedings.neurips.cc
While much progress has been made in understanding the minimax sample complexity of
reinforcement learning (RL)---the complexity of learning on the "worst-case" instance---such …

Proportional response: Contextual bandits for simple and cumulative regret minimization

SK Krishnamurthy, R Zhan, S Athey… - Advances in Neural …, 2023 - proceedings.neurips.cc
In many applications, e.g., in healthcare and e-commerce, the goal of a contextual bandit may
be to learn an optimal treatment assignment policy at the end of the experiment. That is, to …

Experiment planning with function approximation

A Pacchiano, J Lee, E Brunskill - Advances in Neural …, 2024 - proceedings.neurips.cc
We study the problem of experiment planning with function approximation in contextual
bandit problems. In settings where there is a significant overhead to deploying adaptive …

Experimental designs for heteroskedastic variance

J Weltz, T Fiez, A Volfovsky, E Laber… - Advances in …, 2024 - proceedings.neurips.cc
Most linear experimental design problems assume homogeneous variance, while
heteroskedastic noise is present in many realistic settings. Let a learner have …

Learning to Be Fair: A Consequentialist Approach to Equitable Decision Making

A Chohlas-Wood, M Coots, H Zhu… - Management …, 2024 - pubsonline.informs.org
In an attempt to make algorithms fair, the machine learning literature has largely focused on
equalizing decisions, outcomes, or error rates across race or gender groups. To illustrate …

Neural insights for digital marketing content design

F Kong, Y Li, H Nassif, T Fiez, R Henao… - Proceedings of the 29th …, 2023 - dl.acm.org
In digital marketing, experimenting with new website content is one of the key levers to
improve customer engagement. However, creating successful marketing content is a manual …

A data-driven state aggregation approach for dynamic discrete choice models

S Geng, H Nassif… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
In dynamic discrete choice models, a commonly studied problem is estimating parameters of
agent reward functions (also known as 'structural' parameters) using agent behavioral data …

Optimal Exploration is no harder than Thompson Sampling

Z Li, K Jamieson, L Jain - International Conference on …, 2024 - proceedings.mlr.press
Given a set of arms $\mathcal{Z} \subset \mathbb{R}^d$ and an unknown parameter vector
$\theta_\ast \in \mathbb{R}^d$, the pure exploration linear bandits problem aims to return …

Darwin: Flexible learning-based CDN caching

J Chen, N Sharma, T Khan, S Liu, B Chang… - Proceedings of the …, 2023 - dl.acm.org
Cache management is critical for Content Delivery Networks (CDNs), impacting their
performance and operational costs. Most production CDNs apply static, hand-tuned caching …

A contextual ranking and selection method for personalized medicine

J Du, S Gao, CH Chen - Manufacturing & Service …, 2024 - pubsonline.informs.org
Problem definition: Personalized medicine (PM) seeks the best treatment for each patient
among a set of available treatment methods. Because a specific treatment does not work …