Recent years have seen an explosion of work and interest in text-to-3D shape generation. Much of the progress is driven by advances in 3D representations, large-scale pretraining …
Q Ma, X Ning, D Liu, L Niu, L Zhang - arXiv preprint arXiv:2410.06664, 2024 - arxiv.org
Diffusion models are trained by learning a sequence of models that reverse each step of noise corruption. Typically, the model parameters are fully shared across multiple timesteps …
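The "parameters fully shared across timesteps" setup described in the snippet above can be sketched as a single denoiser that is conditioned on the timestep rather than having per-timestep weights. A minimal toy illustration (a linear model with a hand-derived gradient; all names and the noise schedule are assumptions for illustration, not any paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "denoiser": one shared weight matrix W used at every
# timestep t, with t injected as an extra feature (timestep conditioning).
D, T = 4, 10
W = rng.normal(size=(D, D + 1)) * 0.1   # parameters shared across all timesteps

def denoiser(W, x_t, t):
    inp = np.concatenate([x_t, [t / T]])  # same W no matter which t
    return W @ inp, inp                   # predicts the noise added at step t

# One DDPM-style training step: sample t, corrupt x0, regress onto the noise.
x0 = rng.normal(size=D)
t = int(rng.integers(1, T + 1))
alpha_bar = 0.9 ** t                                  # toy noise schedule
eps = rng.normal(size=D)
x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

pred, inp = denoiser(W, x_t, t)
loss = np.mean((pred - eps) ** 2)                     # noise-prediction loss
grad = (2.0 / D) * np.outer(pred - eps, inp)          # d(loss)/dW by hand
W = W - 0.1 * grad                                    # SGD update to the one shared W
```

The point is structural: whichever timestep is sampled, the same `W` is read and updated, which is the weight-sharing choice the snippet says is "typical".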
The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application. To tackle this challenge, recent …
X Yang, X Wang - arXiv preprint arXiv:2404.06091, 2024 - arxiv.org
The evolution of 3D generative modeling has been notably propelled by the adoption of 2D diffusion models. Despite this progress, the cumbersome optimization process itself …
Diffusion Transformers (DiTs) have achieved state-of-the-art (SOTA) image generation quality but suffer from high latency and memory inefficiency, making them difficult to deploy …
Y Wu, H Wang, Z Chen, D Xu - arXiv preprint arXiv:2411.18375, 2024 - arxiv.org
The high computational cost and slow inference time are major obstacles to deploying the video diffusion model (VDM) in practical applications. To overcome this, we introduce a new …
Text-to-video generation enhances content creation but is highly computationally intensive: The computational cost of Diffusion Transformers (DiTs) scales quadratically in the number …
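The quadratic scaling claimed in the snippet above comes from self-attention: the token-token score matrix has n² entries. A back-of-the-envelope FLOP count (the function name and constants are illustrative assumptions, not from the cited work):

```python
def attention_flops(n_tokens: int, d_model: int) -> int:
    # Rough FLOP count for one self-attention layer:
    # Q·K^T scores cost ~n^2 * d, and the attention-weighted
    # sum over values costs another ~n^2 * d.
    return 2 * n_tokens * n_tokens * d_model

# Doubling the token count (e.g. longer video) quadruples the cost.
base = attention_flops(1024, 768)
doubled = attention_flops(2048, 768)
print(doubled // base)  # -> 4
```

This is why video DiTs, which flatten many frames into one long token sequence, hit the quadratic wall much sooner than image models.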
Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has …
Recent advancements in text-to-image diffusion models have enabled the personalization of these models to generate custom images from textual prompts. This paper presents an …