I4VGen: Image as Stepping Stone for Text-to-Video Generation

X Guo, J Liu, M Cui, D Huang - arXiv preprint arXiv:2406.02230, 2024
Text-to-video generation has lagged behind text-to-image synthesis in quality and diversity
due to the complexity of spatio-temporal modeling and limited video-text datasets. This …