A survey on offline reinforcement learning: Taxonomy, review, and open problems

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Unsolved problems in ML safety

D Hendrycks, N Carlini, J Schulman… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning (ML) systems are rapidly increasing in size, are acquiring new
capabilities, and are increasingly deployed in high-stakes settings. As with other powerful …

Biological sequence design with GFlowNets

M Jain, E Bengio, A Hernandez-Garcia… - International …, 2022 - proceedings.mlr.press
Design of de novo biological sequences with desired properties, like protein and
DNA sequences, often involves an active loop with several rounds of molecule ideation and …

Diffusion models for black-box optimization

S Krishnamoorthy, SM Mashkaria… - … on Machine Learning, 2023 - proceedings.mlr.press
The goal of offline black-box optimization (BBO) is to optimize an expensive black-box
function using a fixed dataset of function evaluations. Prior works consider forward …

Bidirectional learning for offline infinite-width model-based optimization

C Chen, Y Zhang, J Fu, XS Liu… - Advances in Neural …, 2022 - proceedings.neurips.cc
In offline model-based optimization, we strive to maximize a black-box objective function by
only leveraging a static dataset of designs and their scores. This problem setting arises in …

ExPT: synthetic pretraining for few-shot experimental design

T Nguyen, S Agrawal, A Grover - Advances in Neural …, 2024 - proceedings.neurips.cc
Experimental design is a fundamental problem in many science and engineering fields. In
this problem, sample efficiency is crucial due to the time, money, and safety costs of real …

Design from policies: Conservative test-time adaptation for offline policy optimization

J Liu, H Zhang, Z Zhuang, Y Kang… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we decouple the iterative bi-level offline RL (value estimation and policy
extraction) from the offline training phase, forming a non-iterative bi-level paradigm and …

Importance-aware co-teaching for offline model-based optimization

Y Yuan, CS Chen, Z Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Offline model-based optimization aims to find a design that maximizes a property of interest
using only an offline dataset, with applications in robot, protein, and molecule design …

Unsupervised learning for combinatorial optimization with principled objective relaxation

HP Wang, N Wu, H Yang, C Hao… - Advances in Neural …, 2022 - proceedings.neurips.cc
Using machine learning to solve combinatorial optimization (CO) problems is challenging,
especially when the data is unlabeled. This work proposes an unsupervised learning …

Parallel-mentoring for offline model-based optimization

CS Chen, C Beckham, Z Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
We study offline model-based optimization to maximize a black-box objective function with a
static dataset of designs and scores. These designs encompass a variety of domains …