Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

Y Ren, X Xia, Y Lu, J Zhang, J Wu, P Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, a series of diffusion-aware distillation algorithms have emerged to alleviate the
computational overhead associated with the multi-step inference process of Diffusion …

Distilling Diffusion Models into Conditional GANs

M Kang, R Zhang, C Barnes, S Paris, S Kwak… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose a method to distill a complex multistep diffusion model into a single-step
conditional GAN student model, dramatically accelerating inference, while preserving image …

Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability

S Gao, J Yang, L Chen, K Chitta, Y Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
World models can foresee the outcomes of different actions, which is of paramount
importance for autonomous driving. Nevertheless, existing driving world models still have …

Improved Distribution Matching Distillation for Fast Image Synthesis

T Yin, M Gharbi, T Park, R Zhang, E Shechtman… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent approaches have shown promise in distilling diffusion models into efficient one-step
generators. Among them, Distribution Matching Distillation (DMD) produces one-step …

Consistency Models Made Easy

Z Geng, A Pokle, W Luo, J Lin, JZ Kolter - arXiv preprint arXiv:2406.14548, 2024 - arxiv.org
Consistency models (CMs) are an emerging class of generative models that offer faster
sampling than traditional diffusion models. CMs enforce that all points along a sampling …

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

H Wen, Z Huang, Y Wang, X Chen, Y Qiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing single image-to-3D creation methods typically involve a two-stage process, first
generating multi-view images, and then using these images for 3D reconstruction. However …

PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models

F Meng, W Shao, L Luo, Y Wang, Y Chen, Q Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-image (T2I) models have made substantial progress in generating images from
textual prompts. However, they frequently fail to produce images consistent with physical …

SF-V: Single Forward Video Generation Model

Z Zhang, Y Li, Y Wu, Y Xu, A Kag… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion-based video generation models have demonstrated remarkable success in
obtaining high-fidelity videos through the iterative denoising process. However, these …

MLCM: Multistep Consistency Distillation of Latent Diffusion Model

Q Xie, Z Liao, Z Deng, S Tang, H Lu - arXiv preprint arXiv:2406.05768, 2024 - arxiv.org
Distilling large latent diffusion models (LDMs) into ones that are fast to sample from is
attracting growing research interest. However, the majority of existing methods face a …

StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning

G Vecchio - arXiv preprint arXiv:2406.09293, 2024 - arxiv.org
We introduce StableMaterials, a novel approach for generating photorealistic physical-based
rendering (PBR) materials that integrates semi-supervised learning with Latent …