A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT

Y Cao, S Li, Y Liu, Z Yan, Y Dai, PS Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …

A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

Null-text inversion for editing real images using guided diffusion models

R Mokady, A Hertz, K Aberman… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent large-scale text-guided diffusion models provide powerful image generation
capabilities. Currently, substantial effort is devoted to enabling the modification of these images …

Guiding pretraining in reinforcement learning with large language models

Y Du, O Watkins, Z Wang, C Colas… - International …, 2023 - proceedings.mlr.press
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped
reward function. Intrinsically motivated exploration methods address this limitation by …

ClipCap: CLIP prefix for image captioning

R Mokady, A Hertz, AH Bermano - arXiv preprint arXiv:2111.09734, 2021 - arxiv.org
Image captioning is a fundamental task in vision-language understanding, in which a model
predicts an informative textual caption for a given input image. In this paper, we present a …

4D-fy: Text-to-4D generation using hybrid score distillation sampling

S Bahmani, I Skorokhodov, V Rong… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and
text-to-video models to generate dynamic 3D scenes. However, current text-to-4D methods face a …

Translation between molecules and natural language

C Edwards, T Lai, K Ros, G Honke, K Cho… - arXiv preprint arXiv …, 2022 - arxiv.org
We present MolT5, a self-supervised learning framework for pretraining models on a
vast amount of unlabeled natural language text and molecule strings. MolT5 …

Quality not quantity: On the interaction between dataset design and robustness of CLIP

T Nguyen, G Ilharco, M Wortsman… - Advances in Neural …, 2022 - proceedings.neurips.cc
Web-crawled datasets have enabled remarkable generalization capabilities in recent
image-text models such as CLIP (Contrastive Language-Image Pre-training) or Flamingo, but little …

Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2023 - Elsevier
The remarkable performance of the Transformer architecture in natural language processing
has also recently triggered broad interest in computer vision. Among other merits …

Deep learning: Systematic review, models, challenges, and research directions

T Talaei Khoei, H Ould Slimane… - Neural Computing and …, 2023 - Springer
Deep learning is currently undergoing a rapid transition into
automation applications. This automation transition can provide a promising framework for …