The power of generative AI: A review of requirements, models, input–output formats, evaluation metrics, and challenges

A Bandi, PVSR Adapa, YEVPK Kuchi - Future Internet, 2023 - mdpi.com
Generative artificial intelligence (AI) has emerged as a powerful technology with numerous
applications in various domains. There is a need to identify the requirements and evaluation …

A review on generative adversarial networks: Algorithms, theory, and applications

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

VQGAN-CLIP: Open domain image generation and editing with natural language guidance

K Crowson, S Biderman, D Kornis, D Stander… - … on Computer Vision, 2022 - Springer
Generating and editing images from open domain text prompts is a challenging task that
heretofore has required expensive and specially trained models. We demonstrate a novel …

MagicBrush: A manually annotated dataset for instruction-guided image editing

K Zhang, L Mo, W Chen, H Sun… - Advances in Neural …, 2024 - proceedings.neurips.cc
Text-guided image editing is widely needed in daily life, ranging from personal use to
professional applications such as Photoshop. However, existing methods are either zero …

Text2LIVE: Text-driven layered image and video editing

O Bar-Tal, D Ofri-Amar, R Fridman, Y Kasten… - European conference on …, 2022 - Springer
We present a method for zero-shot, text-driven editing of natural images and videos. Given
an image or a video and a text prompt, our goal is to edit the appearance of existing objects …

StyleCLIP: Text-driven manipulation of StyleGAN imagery

O Patashnik, Z Wu, E Shechtman… - Proceedings of the …, 2021 - openaccess.thecvf.com
Inspired by the ability of StyleGAN to generate highly realistic images in a variety of
domains, much recent work has focused on understanding how to use the latent spaces …

TediGAN: Text-guided diverse face image generation and manipulation

W Xia, Y Yang, JH Xue, B Wu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
In this work, we propose TediGAN, a novel framework for multi-modal image generation and
manipulation with textual descriptions. The proposed method consists of three components …

An overview of deep learning in medical imaging focusing on MRI

AS Lundervold, A Lundervold - Zeitschrift für Medizinische Physik, 2019 - Elsevier
What has happened in machine learning lately, and what does it mean for the future of
medical image analysis? Machine learning has witnessed a tremendous amount of attention …

DM-GAN: Dynamic memory generative adversarial networks for text-to-image synthesis

M Zhu, P Pan, W Chen, Y Yang - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
In this paper, we focus on generating realistic images from text descriptions. Current
methods first generate an initial image with rough shape and color, and then refine the initial …

Generative adversarial networks in computer vision: A survey and taxonomy

Z Wang, Q She, TE Ward - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Generative adversarial networks (GANs) have been extensively studied in the past few
years. Arguably their most significant impact has been in the area of computer vision where …