Evaluating text-to-visual generation with image-to-text generation

Z Lin, D Pathak, B Li, J Li, X Xia, G Neubig… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant progress in generative AI, comprehensive evaluation remains
challenging because of the lack of effective metrics and standardized benchmarks. For …

Evaluating Text to Image Synthesis: Survey and Taxonomy of Image Quality Metrics

S Hartwig, D Engel, L Sick, H Kniesel, T Payer… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in text-to-image synthesis have been enabled by exploiting a combination
of language and vision through foundation models. These models are pre-trained on …

VersaT2I: Improving text-to-image models with versatile reward

J Guo, W Chai, J Deng, HW Huang, T Ye, Y Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent text-to-image (T2I) models have benefited from large-scale and high-quality data,
demonstrating impressive performance. However, these T2I models still struggle to produce …

TlTScore: Towards Long-Tail Effects in Text-to-Visual Evaluation with Generative Foundation Models

P Ji, J Liu - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Evaluation of generative foundation models (GenFMs) for text-to-visual tasks has
been enhanced by automatic alignment metrics such as CLIPScore complementing human …

GenAI Arena: An Open Evaluation Platform for Generative Models

D Jiang, M Ku, T Li, Y Ni, S Sun, R Fan… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative AI has made remarkable strides to revolutionize fields such as image and video
generation. These advancements are driven by innovative algorithms, architecture, and …

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Y Peng, Y Cui, H Tang, Z Qi, R Dong, J Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
Personalized image generation holds great promise in assisting humans in everyday work
and life due to its impressive function in creatively generating personalized content …

MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

X He, D Jiang, G Zhang, M Ku, A Soni, S Siu… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent years have witnessed great advances in video generation. However, the
development of automatic video metrics is lagging significantly behind. None of the existing …

Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

B Krojer, D Vattikonda, L Lara, V Jampani… - arXiv preprint arXiv …, 2024 - arxiv.org
An image editing model should be able to perform diverse edits, ranging from object
replacement, changing attributes or style, to performing actions or movement, which require …