L Qu, H Li, T Wang, W Wang, Y Li, L Nie… - arXiv preprint arXiv …, 2024 - arxiv.org
How humans can efficiently and effectively acquire images is a perennial question. A typical solution is text-to-image retrieval from an existing database given the text …
Z Wang, H Liu, J Yu, T Zhang, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Amid the rising intersection of generative AI and human artistic processes, this study probes the critical yet less-explored terrain of alignment in human-centric automatic song …
In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering a novel approach to synthesizing musical content from textual descriptions …