Evaluating and Improving Compositional Text-to-Visual Generation

B Li, Z Lin, D Pathak, J Li, Y Fei, K Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
While text-to-visual models now produce photo-realistic images and videos, they struggle
with compositional text prompts involving attributes, relationships, and higher-order …

Prompt Optimization with Human Feedback

X Lin, Z Dai, A Verma, SK Ng, P Jaillet… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated remarkable performances in various
tasks. However, the performance of LLMs heavily depends on the input prompt, which has …

GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation

B Li, Z Lin, D Pathak, J Li, Y Fei, K Wu, T Ling… - arXiv preprint arXiv …, 2024 - arxiv.org
While text-to-visual models now produce photo-realistic images and videos, they struggle
with compositional text prompts involving attributes, relationships, and higher-order …

Model-Agnostic Human Preference Inversion in Diffusion Models

J Kim, Z Wang, Q Qiu - arXiv preprint arXiv:2404.00879, 2024 - arxiv.org
Efficient text-to-image generation remains a challenging task due to the high computational
costs associated with the multi-step sampling in diffusion models. Although distillation of pre …