Exponential smoothing for off-policy learning

I Aouali, VE Brunel, D Rohde… - … Conference on Machine …, 2023 - proceedings.mlr.press
Off-policy learning (OPL) aims at finding improved policies from logged bandit data, often by
minimizing the inverse propensity scoring (IPS) estimator of the risk. In this work, we …

Adaptive importance sampling for heavy-tailed distributions via α-divergence minimization

T Guilmeau, N Branchini… - International …, 2024 - proceedings.mlr.press
Adaptive importance sampling (AIS) algorithms are widely used to approximate expectations
with respect to complicated target probability distributions. When the target has heavy tails …

Sampling in Unit Time with Kernel Fisher-Rao Flow

A Maurais, Y Marzouk - arXiv preprint arXiv:2401.03892, 2024 - arxiv.org
We introduce a new mean-field ODE and corresponding interacting particle systems for
sampling from an unnormalized target density or Bayesian posterior. The interacting particle …

A connection between tempering and entropic mirror descent

N Chopin, FR Crucinio, A Korba - arXiv preprint arXiv:2310.11914, 2023 - arxiv.org
This paper explores the connections between tempering (for Sequential Monte Carlo; SMC)
and entropic mirror descent to sample from a target probability distribution whose …

Composite likelihood inference for the Poisson log-normal model

J Stoehr, SS Robin - arXiv preprint arXiv:2402.14390, 2024 - arxiv.org
Inferring parameters of a latent variable model can be a daunting task when the conditional
distribution of the latent variables given the observed ones is intractable. Variational …

A quadrature rule combining control variates and adaptive importance sampling

R Leluc, F Portier, J Segers… - Advances in Neural …, 2022 - proceedings.neurips.cc
Driven by several successful applications such as in stochastic gradient descent or in
Bayesian computation, control variates have become a major tool for Monte Carlo …

Regularized Rényi divergence minimization through Bregman proximal gradient algorithms

T Guilmeau, E Chouzenoux, V Elvira - arXiv preprint arXiv:2211.04776, 2022 - arxiv.org
We study the variational inference problem of minimizing a regularized Rényi divergence
over an exponential family, and propose a relaxed moment-matching algorithm, which …

Information-Theoretic Generative Clustering of Documents

X Du, K Tanaka-Ishii - arXiv preprint arXiv:2412.13534, 2024 - arxiv.org
We present generative clustering (GC) for clustering a set of documents, $\mathrm{X}$,
by using texts $\mathrm{Y}$ generated by large language models (LLMs) instead of by …

Stochastic mirror descent for nonparametric adaptive importance sampling

P Bianchi, B Delyon, V Priser, F Portier - arXiv preprint arXiv:2409.13272, 2024 - arxiv.org
This paper addresses the problem of approximating an unknown probability distribution with
density $f$, which can only be evaluated up to an unknown scaling factor, with the help of …

Enhancing Monte Carlo integration by control variates and statistical learning

A Zhuman - 2024 - dial.uclouvain.be
Monte Carlo methods have their roots in the 1940s and are closely associated with
mathematician Stanisław Ulam. While recovering from an illness, Ulam was playing solitaire …