K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Z Li, X Liu, D Fu, J Li, Q Gu, K Keutzer… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of visual generative models necessitates efficient and reliable
evaluation methods. The Arena platform, which gathers user votes on model comparisons, can …
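Since arena-style evaluation amounts to turning human votes over model comparisons into a ranking, the sketch below shows one simple way this can be done: decomposing a single k-wise vote into pairwise outcomes and applying Elo-style rating updates. This is an illustrative assumption only, not the K-Sort Arena algorithm; the model names and the K_FACTOR constant are made up for the example.

```python
# Illustrative sketch only: aggregate one k-wise human vote into model ratings
# by splitting it into pairwise outcomes and applying Elo-style updates.
# NOT the K-Sort Arena method; names and constants are hypothetical.
from itertools import combinations

K_FACTOR = 32  # hypothetical Elo update step


def expected_score(r_a, r_b):
    """Probability that a model rated r_a beats one rated r_b under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def update_from_kwise_vote(ratings, ranked_models):
    """ranked_models: k model names ordered best-to-worst by a single human vote."""
    for winner, loser in combinations(ranked_models, 2):
        e_w = expected_score(ratings[winner], ratings[loser])
        ratings[winner] += K_FACTOR * (1 - e_w)
        ratings[loser] -= K_FACTOR * (1 - e_w)


ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}
update_from_kwise_vote(ratings, ["model_b", "model_a", "model_c"])
print(sorted(ratings.items(), key=lambda kv: -kv[1]))
```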

Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts

Z Zhang, X Liu, H Cheng, C Xu, J Gao - arXiv preprint arXiv:2407.09590, 2024 - arxiv.org
By increasing model parameters while activating them sparsely for a given task, the
Mixture-of-Experts (MoE) architecture significantly improves the performance of Large …
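Because the snippet summarizes the core MoE idea of growing the parameter count while activating only a few experts per token, here is a minimal top-k gated MoE layer as a hedged illustration of that architecture. It is not the paper's pruning method; the class name, layer sizes, and the choice of PyTorch are all assumptions for the example.

```python
# Illustrative sketch only: a minimal top-k gated Mixture-of-Experts layer.
# More experts increase total parameters, but each token activates only k of them.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts)  # router over experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.gate(x)                            # (tokens, num_experts)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)         # mix only the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out


x = torch.randn(10, 64)
print(TopKMoE()(x).shape)  # torch.Size([10, 64])
```

The routing loop is written for clarity rather than speed; practical MoE layers typically group tokens per expert and add a load-balancing objective.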

Unveiling Redundancy in Diffusion Transformers (DiTs): A Systematic Study

X Sun, J Fang, A Li, J Pan - arXiv preprint arXiv:2411.13588, 2024 - arxiv.org
The increased model capacity of Diffusion Transformers (DiTs) and the demand for
generating higher-resolution images and videos have led to a significant rise in inference …

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

X Shen, Z Song, Y Zhou, B Chen, Y Li, Y Gong… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion Transformers have emerged as the preeminent models for a wide array of
generative tasks, demonstrating superior performance and efficacy across various …

TinyFusion: Diffusion Transformers Learned Shallow

G Fang, K Li, X Ma, X Wang - arXiv preprint arXiv:2412.01199, 2024 - arxiv.org
Diffusion Transformers have demonstrated remarkable capabilities in image generation but
often come with excessive parameterization, resulting in considerable inference overhead in …

Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching

X Ma, G Fang, MB Mi, X Wang - arXiv preprint arXiv:2406.01733, 2024 - arxiv.org
Diffusion Transformers have recently demonstrated unprecedented generative capabilities
for various tasks. The encouraging results, however, come with the cost of slow inference …

Effortless Efficiency: Low-Cost Pruning of Diffusion Models

Y Zhang, E Jin, Y Dong, A Khakzar, P Torr… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have achieved impressive advancements in various vision tasks. However,
these gains often rely on increasing model size, which escalates computational complexity …

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

D Hu, J Chen, X Huang, H Coskun, A Sahni… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing text-to-image (T2I) diffusion models face several limitations, including large model
sizes, slow runtime, and low-quality generation on mobile devices. This paper aims to …

DiffusionTrend: A Minimalist Approach to Virtual Fashion Try-On

W Zhan, M Lin, S Yan, R Ji - arXiv preprint arXiv:2412.14465, 2024 - arxiv.org
We introduce DiffusionTrend, a virtual fashion try-on approach that forgoes the need for
retraining diffusion models. Using advanced diffusion models, DiffusionTrend harnesses latent …

Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion

SN Ramesh, Z Zhao - arXiv preprint arXiv:2411.15113, 2024 - arxiv.org
As text-to-image models grow increasingly powerful and complex, their burgeoning size
presents a significant obstacle to widespread adoption, especially on resource-constrained …