Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space

CF Park, M Okawa, A Lee, ES Lubana… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern generative models demonstrate impressive capabilities, likely stemming from an
ability to identify and manipulate abstract concepts underlying their training data. However …

Unified Text-to-Image Generation and Retrieval

L Qu, H Li, T Wang, W Wang, Y Li, L Nie… - arXiv preprint arXiv …, 2024 - arxiv.org
How humans can efficiently and effectively acquire images has always been a perennial
question. A typical solution is text-to-image retrieval from an existing database given the text …

MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation

Z Wang, H Liu, J Yu, T Zhang, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Amid the rising intersection of generative AI and human artistic processes, this study probes
the critical yet less-explored terrain of alignment in human-centric automatic song …

Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

C Li, R Wang, L Liu, J Du, Y Sun, Z Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, diffusion-based text-to-music (TTM) generation has gained prominence,
offering a novel approach to synthesizing musical content from textual descriptions …