Consider the sequential optimization of a continuous, possibly non-convex, and expensive-to-evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) …
In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts …
In the stochastic linear contextual bandit setting, there exist several minimax procedures for exploration with policies that are reactive to the data being acquired. In practice, there can …
In situ magnetic field measurements are often difficult to obtain due to the presence of stray magnetic fields generated by spacecraft electrical subsystems. The conventional solution is …
AP Hoffmann, MB Moldwin - Journal of Geophysical Research …, 2022 - Wiley Online Library
The use of magnetometers for space exploration is inhibited by magnetic noise generated by spacecraft electrical systems. Mechanical booms are traditionally used to extend …
AP Hoffmann, MB Moldwin - IEEE Transactions on Aerospace …, 2023 - ieeexplore.ieee.org
Spacecraft magnetic field measurements are frequently degraded by stray magnetic fields originating from onboard electrical systems. These interference signals can mask the natural …
M Kato, K Ariu - arXiv preprint arXiv:2106.14077, 2021 - arxiv.org
We study the best-arm identification problem with fixed confidence when contextual (covariate) information is available in stochastic bandits. Although we can use contextual …
M Jörke, J Lee, E Brunskill - … Design and Active Learning in the …, 2022 - realworldml.github.io
We study the best policy identification problem for contextual bandits through the lens of Bayesian optimal experimental design. Motivated by practical constraints when deploying …
Offline reinforcement learning is used to train policies in scenarios where real-time access to the environment is expensive or impossible. As a natural consequence of these harsh …