Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream …
A Sauer, T Karras, S Laine… - … on machine learning, 2023 - proceedings.mlr.press
Text-to-image synthesis has recently seen significant progress thanks to large pretrained language models, large-scale training data, and the introduction of scalable model families …
Recent advances in personalized image generation have enabled pre-trained text-to-image models to learn new concepts from specific image sets. However, these methods often …
Large-scale text-to-image diffusion models can generate high-fidelity images with powerful compositional ability. However, these models are typically trained on an enormous amount …
Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened …
We investigate the potential of learning visual representations using synthetic images generated by text-to-image models. This is a natural question in the light of the excellent …
The stunning qualitative improvement of text-to-image models has attracted widespread attention and adoption. However, we lack a comprehensive quantitative understanding of …
A Sauer, D Lorenz, A Blattmann… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps while …
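The core idea the ADD snippet points to, pairing a distillation target from a frozen diffusion teacher with an adversarial objective on a few-step student's outputs, can be sketched roughly as follows. This is a minimal illustrative sketch under assumed simplifications (toy networks, a hinge GAN loss, unit loss weighting, a single noise level), not the paper's actual formulation.

```python
# Illustrative sketch of adversarial distillation of a diffusion-style teacher
# into a one-step student. All module names, losses, and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Stand-in for a (much larger) image denoiser / generator."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.SiLU(),
            nn.Conv2d(32, ch, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    """Stand-in for an image discriminator; outputs one score per image."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )
    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))

student, teacher, disc = TinyDenoiser(), TinyDenoiser(), TinyDiscriminator()
teacher.requires_grad_(False)  # frozen pre-trained teacher
opt_g = torch.optim.Adam(student.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

for step in range(100):
    real = torch.randn(8, 3, 32, 32)            # placeholder for real images
    noised = real + 0.5 * torch.randn_like(real)

    fake = student(noised)                       # one-step student prediction
    with torch.no_grad():
        target = teacher(noised)                 # teacher's denoising estimate

    # Discriminator update: hinge loss on real vs. student outputs.
    d_loss = (F.relu(1 - disc(real)).mean()
              + F.relu(1 + disc(fake.detach())).mean())
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Student update: adversarial term plus distillation toward the teacher.
    g_loss = -disc(fake).mean() + F.mse_loss(fake, target)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```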
The most advanced text-to-image (T2I) models require significant training costs (e.g., millions of GPU hours), seriously hindering fundamental innovation in the AIGC community …