On the scalability of diffusion-based text-to-image generation

Citing articles (4 results)

Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space

CF Park, M Okawa, A Lee, ES Lubana… - arXiv preprint arXiv …, 2024 - arxiv.org

Modern generative models demonstrate impressive capabilities, likely stemming from an
ability to identify and manipulate abstract concepts underlying their training data. However …

Cited by: 1

Unified Text-to-Image Generation and Retrieval

L Qu, H Li, T Wang, W Wang, Y Li, L Nie… - arXiv preprint arXiv …, 2024 - arxiv.org

How humans can efficiently and effectively acquire images has always been a perennial
question. A typical solution is text-to-image retrieval from an existing database given the text …

MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation

Z Wang, H Liu, J Yu, T Zhang, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Amid the rising intersection of generative AI and human artistic processes, this study probes
the critical yet less-explored terrain of alignment in human-centric automatic song …

Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

C Li, R Wang, L Liu, J Du, Y Sun, Z Guo… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, diffusion-based text-to-music (TTM) generation has gained prominence,
offering a novel approach to synthesizing musical content from textual descriptions …
