StyleGAN-T: Unlocking the power of GANs for fast large-scale text-to-image synthesis

A Sauer, T Karras, S Laine… - … on machine learning, 2023 - proceedings.mlr.press
Text-to-image synthesis has recently seen significant progress thanks to large pretrained
language models, large-scale training data, and the introduction of scalable model families …

Flow matching for generative modeling

Y Lipman, RTQ Chen, H Ben-Hamu, M Nickel… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce a new paradigm for generative modeling built on Continuous Normalizing
Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present …

Towards generalist biomedical AI

T Tu, S Azizi, D Driess, M Schaekermann, M Amin… - NEJM AI, 2024 - ai.nejm.org
Background: Medicine is inherently multimodal, requiring the simultaneous interpretation
and integration of insights between many data modalities spanning text, imaging, genomics …

Planning with diffusion for flexible behavior synthesis

M Janner, Y Du, JB Tenenbaum, S Levine - arXiv preprint arXiv …, 2022 - arxiv.org
Model-based reinforcement learning methods often use learning only for the purpose of
estimating an approximate dynamics model, offloading the rest of the decision-making work …

DreamBooth3D: Subject-driven text-to-3D generation

A Raj, S Kaza, B Poole, M Niemeyer… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present DreamBooth3D, an approach to personalize text-to-3D generative models from
as few as 3-6 casually captured images of a subject. Our approach combines recent …

Generative novel view synthesis with 3D-aware diffusion models

ER Chan, K Nagano, MA Chan… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a diffusion-based model for 3D-aware generative novel view synthesis from as
few as a single input image. Our model samples from the distribution of possible renderings …

Stable video diffusion: Scaling latent video diffusion models to large datasets

A Blattmann, T Dockhorn, S Kulal… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Stable Video Diffusion, a latent video diffusion model for high-resolution,
state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …

Video diffusion models

J Ho, T Salimans, A Gritsenko… - Advances in …, 2022 - proceedings.neurips.cc
Generating temporally coherent high fidelity video is an important milestone in generative
modeling research. We make progress towards this milestone by proposing a diffusion …

T2I-CompBench: A comprehensive benchmark for open-world compositional text-to-image generation

K Huang, K Sun, E Xie, Z Li… - Advances in Neural …, 2023 - proceedings.neurips.cc
Despite the stunning ability to generate high-quality images by recent text-to-image models,
current approaches often struggle to effectively compose objects with different attributes and …

Illuminating protein space with a programmable generative model

JB Ingraham, M Baranov, Z Costello, KW Barber… - Nature, 2023 - nature.com
Three billion years of evolution has produced a tremendous diversity of protein molecules,
but the full potential of proteins is likely to be much greater. Accessing this potential has …