Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Optimal non-parametric learning in repeated contextual auctions with strategic buyer

A Drutsa - International Conference on Machine Learning, 2020 - proceedings.mlr.press
We study learning algorithms that optimize revenue in repeated contextual posted-price
auctions where a seller interacts with a single strategic buyer that seeks to maximize his …

Horizon-independent optimal pricing in repeated auctions with truthful and strategic buyers

A Drutsa - Proceedings of the 26th International Conference on …, 2017 - dl.acm.org
We study revenue optimization learning algorithms for repeated posted-price auctions
where a seller interacts with a (truthful or strategic) buyer that holds a fixed valuation. We …

Optimization of an SSP's header bidding strategy using Thompson sampling

G Jauvion, N Grislain, P Dkengne Sielenou… - Proceedings of the 24th …, 2018 - dl.acm.org
Over the last decade, digital media (web or app publishers) generalized the use of real time
ad auctions to sell their ad spaces. Multiple auction platforms, also called Supply-Side …

Weakly consistent optimal pricing algorithms in repeated posted-price auctions with strategic buyer

A Drutsa - International Conference on Machine Learning, 2018 - proceedings.mlr.press
We study revenue optimization learning algorithms for repeated posted-price auctions
where a seller interacts with a single strategic buyer that holds a fixed private valuation for a …

Optimal pricing in repeated posted-price auctions with different patience of the seller and the buyer

A Vanunts, A Drutsa - Advances in Neural Information …, 2019 - proceedings.neurips.cc
We study revenue optimization pricing algorithms for repeated posted-price auctions where
a seller interacts with a single strategic buyer that holds a fixed private valuation. When the …

On the robustness of epoch-greedy in multi-agent contextual bandit mechanisms

Y Xu, B Kumar, J Abernethy - arXiv preprint arXiv:2307.07675, 2023 - arxiv.org
Efficient learning in multi-armed bandit mechanisms such as pay-per-click (PPC) auctions
typically involves three challenges: 1) inducing truthful bidding behavior (incentives), 2) …

Reserve pricing in repeated second-price auctions with strategic bidders

A Drutsa - International Conference on Machine Learning, 2020 - proceedings.mlr.press
We study revenue optimization learning algorithms for repeated second-price auctions with
reserve where a seller interacts with multiple strategic bidders each of which holds a fixed …

Low revenue in display ad auctions: Algorithmic collusion vs. non-quasilinear preferences

M Bichler, A Gupta, L Mathews… - arXiv preprint arXiv …, 2023 - arxiv.org
The transition of display ad exchanges from second-price to first-price auctions has raised
questions about its impact on revenue. Evaluating this shift empirically proves challenging …

On consistency of optimal pricing algorithms in repeated posted-price auctions with strategic buyer

A Drutsa - arXiv preprint arXiv:1707.05101, 2017 - arxiv.org
We study revenue optimization learning algorithms for repeated posted-price auctions
where a seller interacts with a single strategic buyer that holds a fixed private valuation for a …