ChatGPT is not all you need. A State of the Art Review of large Generative AI models

R Gozalo-Brizuela, EC Garrido-Merchan - arXiv preprint arXiv:2301.04655, 2023 - arxiv.org
During the last two years there has been a plethora of large generative models such as
ChatGPT or Stable Diffusion that have been published. Concretely, these models are able to …

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

MU Hadi, Q Al Tashi, A Shah, R Qureshi… - Authorea …, 2024 - authorea.com
Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

Scaling up GANs for text-to-image synthesis

M Kang, JY Zhu, R Zhang, J Park… - Proceedings of the …, 2023 - openaccess.thecvf.com
The recent success of text-to-image synthesis has taken the world by storm and captured the
general public's imagination. From a technical standpoint, it also marked a drastic change in …

Generative multimodal models are in-context learners

Q Sun, Y Cui, X Zhang, F Zhang, Q Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Humans can easily solve multimodal tasks in context with only a few demonstrations or
simple instructions which current multimodal systems largely struggle to imitate. In this work …

StyleGAN-T: Unlocking the power of GANs for fast large-scale text-to-image synthesis

A Sauer, T Karras, S Laine… - … on machine learning, 2023 - proceedings.mlr.press
Text-to-image synthesis has recently seen significant progress thanks to large pretrained
language models, large-scale training data, and the introduction of scalable model families …

ELITE: Encoding visual concepts into textual embeddings for customized text-to-image generation

Y Wei, Y Zhang, Z Ji, J Bai… - Proceedings of the …, 2023 - openaccess.thecvf.com
In addition to the unprecedented ability in imaginary creation, large text-to-image models are
expected to take customized concepts in image generation. Existing works generally learn …

DreamBooth3D: Subject-driven text-to-3D generation

A Raj, S Kaza, B Poole, M Niemeyer… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present DreamBooth3D, an approach to personalize text-to-3D generative models from
as few as 3-6 casually captured images of a subject. Our approach combines recent …

T2I-CompBench: A comprehensive benchmark for open-world compositional text-to-image generation

K Huang, K Sun, E Xie, Z Li… - Advances in Neural …, 2023 - proceedings.neurips.cc
Despite the stunning ability to generate high-quality images by recent text-to-image models,
current approaches often struggle to effectively compose objects with different attributes and …

GLIGEN: Open-set grounded text-to-image generation

Y Li, H Liu, Q Wu, F Mu, J Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image diffusion models have made amazing advances. However, the
status quo is to use text input alone, which can impede controllability. In this work, we …

SVDiff: Compact parameter space for diffusion fine-tuning

L Han, Y Li, H Zhang, P Milanfar… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, diffusion models have achieved remarkable success in text-to-image generation,
enabling the creation of high-quality images from text prompts and various conditions …