A review of synthetic image data and its use in computer vision

K Man, J Chahl - Journal of Imaging, 2022 - mdpi.com
Development of computer vision algorithms using convolutional neural networks and deep
learning has necessitated ever greater amounts of annotated and labelled data to produce …

On the detection of synthetic images generated by diffusion models

R Corvi, D Cozzolino, G Zingarini… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Over the past decade, there has been tremendous progress in creating synthetic media,
mainly thanks to the development of powerful methods based on generative adversarial …

Intriguing properties of synthetic images: from generative adversarial networks to diffusion models

R Corvi, D Cozzolino, G Poggi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Detecting fake images is becoming a major goal of computer vision. This need is becoming
more and more pressing with the continuous improvement of synthesis methods based on …

Large language models and political science

M Linegar, R Kocielnik, RM Alvarez - Frontiers in Political Science, 2023 - frontiersin.org
Large Language Models (LLMs) are a type of artificial intelligence that uses information from
very large datasets to model the use of language and generate content. While LLMs like …

3DALL-E: Integrating text-to-image AI in 3D design workflows

V Liu, J Vermeulen, G Fitzmaurice… - Proceedings of the 2023 …, 2023 - dl.acm.org
Text-to-image AI are capable of generating novel images for inspiration, but their
applications for 3D design workflows and how designers can build 3D models using AI …

Opal: Multimodal image generation for news illustration

V Liu, H Qiao, L Chilton - Proceedings of the 35th Annual ACM …, 2022 - dl.acm.org
Advances in multimodal AI have presented people with powerful ways to create images from
text. Recent work has shown that text-to-image generations are able to represent a broad …

ClipFace: Text-guided editing of textured 3D morphable models

S Aneja, J Thies, A Dai, M Nießner - ACM SIGGRAPH 2023 Conference …, 2023 - dl.acm.org
We propose ClipFace, a novel self-supervised approach for text-guided editing of textured
3D morphable model of faces. Specifically, we employ user-friendly language prompts to …

How well can text-to-image generative models understand ethical natural language interventions?

H Bansal, D Yin, M Monajatipoor, KW Chang - arXiv preprint arXiv …, 2022 - arxiv.org
Text-to-image generative models have achieved unprecedented success in generating
high-quality images based on natural language descriptions. However, it is shown that these …

GenEval: An object-focused framework for evaluating text-to-image alignment

D Ghosh, H Hajishirzi… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent breakthroughs in diffusion models, multimodal pretraining, and efficient finetuning
have led to an explosion of text-to-image generative models. Given human evaluation is …

No "zero-shot" without exponential data: Pretraining concept frequency determines multimodal model performance

V Udandarao, A Prabhu, A Ghosh… - The Thirty-eighth …, 2024 - openreview.net
Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation
performance of multimodal models, such as CLIP for classification and Stable-Diffusion for …