Simple regret minimization for contextual bandits

B Hao, T Lattimore, M Wang - Advances in Neural …, 2020 - proceedings.neurips.cc

Stochastic linear bandits with high-dimensional sparse features are a practical model for a
variety of domains, such as personalized medicine and online advertising. We derive a …

Cited by 67 Related articles All 9 versions

[PDF] neurips.cc

Optimal order simple regret for Gaussian process bandits

S Vakili, N Bouziani, S Jalali… - Advances in Neural …, 2021 - proceedings.neurips.cc

Consider the sequential optimization of a continuous, possibly non-convex, and expensive
to evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) …

Cited by 46 Related articles All 7 versions

[PDF] neurips.cc

Instance-optimal PAC algorithms for contextual bandits

Z Li, L Ratliff, KG Jamieson… - Advances in Neural …, 2022 - proceedings.neurips.cc

In the stochastic contextual bandit setting, regret-minimizing algorithms have been
extensively researched, but their instance-minimizing best-arm identification counterparts …

Cited by 27 Related articles All 12 versions

[PDF] neurips.cc

Design of experiments for stochastic contextual linear bandits

A Zanette, K Dong, JN Lee… - Advances in Neural …, 2021 - proceedings.neurips.cc

In the stochastic linear contextual bandit setting there exist several minimax procedures for
exploration with policies that are reactive to the data being acquired. In practice, there can …

Cited by 29 Related articles All 8 versions

[PDF] wiley.com

Enabling boomless CubeSat magnetic field measurements with the Quad‐Mag magnetometer and an improved underdetermined blind source separation algorithm

AP Hoffmann, MB Moldwin, BP Strabel… - Journal of …, 2023 - Wiley Online Library

In situ magnetic field measurements are often difficult to obtain due to the presence of stray
magnetic fields generated by spacecraft electrical subsystems. The conventional solution is …

Cited by 6 Related articles All 2 versions

[PDF] wiley.com

Separation of spacecraft noise from geomagnetic field observations through density‐based cluster analysis and compressive sensing

AP Hoffmann, MB Moldwin - Journal of Geophysical Research …, 2022 - Wiley Online Library

The use of magnetometers for space exploration is inhibited by magnetic noise generated
by spacecraft electrical systems. Mechanical booms are traditionally used to extend …

Cited by 14 Related articles All 15 versions

[PDF] ieee.org

Wavelet-Adaptive Interference Cancellation for Underdetermined Platforms: Enhancing Boomless Magnetic Field Measurements on Compact Spacecraft

AP Hoffmann, MB Moldwin - IEEE Transactions on Aerospace …, 2023 - ieeexplore.ieee.org

Spacecraft magnetic field measurements are frequently degraded by stray magnetic fields
originating from onboard electrical systems. These interference signals can mask the natural …

Cited by 4 Related articles All 6 versions

[PDF] arxiv.org

The role of contextual information in best arm identification

M Kato, K Ariu - arXiv preprint arXiv:2106.14077, 2021 - arxiv.org

We study the best-arm identification problem with fixed confidence when contextual
(covariate) information is available in stochastic bandits. Although we can use contextual …

Cited by 13 Related articles All 6 versions

[PDF] github.io

[PDF] Simple regret minimization for contextual bandits using Bayesian optimal experimental design

M Jörke, J Lee, E Brunskill - … Design and Active Learning in the …, 2022 - realworldml.github.io

We study the best policy identification problem for contextual bandits through the lens of
Bayesian optimal experimental design. Motivated by practical constraints when deploying …

Cited by 5 Related articles

[PDF] arxiv.org

Offline RL with resource constrained online deployment

JR Regatti, AA Deshmukh, F Cheng, YH Jung… - arXiv preprint arXiv …, 2021 - arxiv.org

Offline reinforcement learning is used to train policies in scenarios where real-time access to
the environment is expensive or impossible. As a natural consequence of these harsh …

Cited by 4 Related articles All 3 versions
