A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT

Y Cao, S Li, Y Liu, Z Yan, Y Dai, PS Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …

ChatGPT for shaping the future of dentistry: the potential of multi-modal large language model

H Huang, O Zheng, D Wang, J Yin, Z Wang… - International Journal of …, 2023 - nature.com
ChatGPT, a lightweight and conversational variant of Generative Pretrained Transformer 4 (GPT-
4) developed by OpenAI, is one of the milestone Large Language Models (LLMs) with …

Visual instruction tuning

H Liu, C Li, Q Wu, YJ Lee - Advances in neural information …, 2024 - proceedings.neurips.cc
Instruction tuning large language models (LLMs) using machine-generated instruction-
following data has been shown to improve zero-shot capabilities on new tasks, but the idea …

Improved baselines with visual instruction tuning

H Liu, C Li, Y Li, YJ Lee - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Large multimodal models (LMMs) have recently shown encouraging progress with visual
instruction tuning. In this paper, we present the first systematic study to investigate the design …

InstructBLIP: Towards general-purpose vision-language models with instruction tuning

W Dai, J Li, D Li, AMH Tiong, J Zhao… - Advances in …, 2024 - proceedings.neurips.cc
Large-scale pre-training and instruction tuning have been successful at creating general-
purpose language models with broad competence. However, building general-purpose …

mPLUG-Owl: Modularization empowers large language models with multimodality

Q Ye, H Xu, G Xu, J Ye, M Yan, Y Zhou, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated impressive zero-shot abilities on a
variety of open-ended tasks, while recent research has also explored the use of LLMs for …

HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face

Y Shen, K Song, X Tan, D Li, W Lu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Solving complicated AI tasks with different domains and modalities is a key step toward
artificial general intelligence. While there are numerous AI models available for various …

LLaMA-Adapter: Efficient fine-tuning of language models with zero-init attention

R Zhang, J Han, C Liu, P Gao, A Zhou, X Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
We present LLaMA-Adapter, a lightweight adaption method to efficiently fine-tune LLaMA
into an instruction-following model. Using 52K self-instruct demonstrations, LLaMA-Adapter …

MiniGPT-4: Enhancing vision-language understanding with advanced large language models

D Zhu, J Chen, X Shen, X Li, M Elhoseiny - arXiv preprint arXiv …, 2023 - arxiv.org
The recent GPT-4 has demonstrated extraordinary multi-modal abilities, such as directly
generating websites from handwritten text and identifying humorous elements within …

LISA: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Although perception systems have made remarkable advancements in recent years, they still
rely on explicit human instruction or pre-defined categories to identify the target objects …