- 学术资源搜索

ChatGPT is not all you need. A State of the Art Review of large Generative AI models

R Gozalo-Brizuela, EC Garrido-Merchan - arXiv preprint arXiv:2301.04655, 2023 - arxiv.org

During the last two years there has been a plethora of large generative models such as
ChatGPT or Stable Diffusion that have been published. Concretely, these models are able to …

被引用次数：339 相关文章所有 5 个版本

[PDF] authorea.com

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

MU Hadi, Q Al Tashi, A Shah, R Qureshi… - Authorea …, 2024 - authorea.com

Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

被引用次数：128 相关文章所有 5 个版本

[PDF] thecvf.com

Scaling up gans for text-to-image synthesis

M Kang, JY Zhu, R Zhang, J Park… - Proceedings of the …, 2023 - openaccess.thecvf.com

The recent success of text-to-image synthesis has taken the world by storm and captured the
general public's imagination. From a technical standpoint, it also marked a drastic change in …

被引用次数：368 相关文章所有 6 个版本

[PDF] thecvf.com

Generative multimodal models are in-context learners

Q Sun, Y Cui, X Zhang, F Zhang, Q Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Humans can easily solve multimodal tasks in context with only a few demonstrations or
simple instructions which current multimodal systems largely struggle to imitate. In this work …

被引用次数：110 相关文章所有 3 个版本

[PDF] mlr.press

Stylegan-t: Unlocking the power of gans for fast large-scale text-to-image synthesis

A Sauer, T Karras, S Laine… - … on machine learning, 2023 - proceedings.mlr.press

Text-to-image synthesis has recently seen significant progress thanks to large pretrained
language models, large-scale training data, and the introduction of scalable model families …

被引用次数：183 相关文章所有 8 个版本

[PDF] thecvf.com

Elite: Encoding visual concepts into textual embeddings for customized text-to-image generation

Y Wei, Y Zhang, Z Ji, J Bai… - Proceedings of the …, 2023 - openaccess.thecvf.com

In addition to the unprecedented ability in imaginary creation, large text-to-image models are
expected to take customized concepts in image generation. Existing works generally learn …

被引用次数：210 相关文章所有 5 个版本

[PDF] thecvf.com

Dreambooth3d: Subject-driven text-to-3d generation

A Raj, S Kaza, B Poole, M Niemeyer… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present DreamBooth3D, an approach to personalize text-to-3D generative models from
as few as 3-6 casually captured images of a subject. Our approach combines recent …

被引用次数：159 相关文章所有 5 个版本

[PDF] neurips.cc

T2i-compbench: A comprehensive benchmark for open-world compositional text-to-image generation

K Huang, K Sun, E Xie, Z Li… - Advances in Neural …, 2023 - proceedings.neurips.cc

Despite the stunning ability to generate high-quality images by recent text-to-image models,
current approaches often struggle to effectively compose objects with different attributes and …

被引用次数：90 相关文章所有 6 个版本

[PDF] thecvf.com

Gligen: Open-set grounded text-to-image generation

Y Li, H Liu, Q Wu, F Mu, J Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale text-to-image diffusion models have made amazing advances. However, the
status quo is to use text input alone, which can impede controllability. In this work, we …

被引用次数：189 相关文章所有 5 个版本

[PDF] thecvf.com

Svdiff: Compact parameter space for diffusion fine-tuning

L Han, Y Li, H Zhang, P Milanfar… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, diffusion models have achieved remarkable success in text-to-image generation,
enabling the creation of high-quality images from text prompts and various conditions …

被引用次数：148 相关文章所有 9 个版本

高级搜索

QQ 群

ChatGPT is not all you need. A State of the Art Review of large Generative AI models

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

Scaling up gans for text-to-image synthesis

Generative multimodal models are in-context learners

Stylegan-t: Unlocking the power of gans for fast large-scale text-to-image synthesis

Elite: Encoding visual concepts into textual embeddings for customized text-to-image generation

Dreambooth3d: Subject-driven text-to-3d generation

T2i-compbench: A comprehensive benchmark for open-world compositional text-to-image generation

Gligen: Open-set grounded text-to-image generation

Svdiff: Compact parameter space for diffusion fine-tuning

引用