Z Ding, P Li, Q Yang, S Li - arXiv preprint arXiv:2406.01956, 2024 - arxiv.org
This paper presents a novel approach to enhancing image-to-image generation by leveraging
the multimodal capabilities of the Large Language and Vision Assistant (LLaVA). We …