A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

ChatGPT and Open-AI models: A preliminary review

KI Roumeliotis, ND Tselikas - Future Internet, 2023 - mdpi.com
According to numerous reports, ChatGPT represents a significant breakthrough in the field of
artificial intelligence. ChatGPT is a pre-trained AI model designed to engage in natural …

Attention is all you need (NIPS), 2017

A Vaswani, N Shazeer, N Parmar, J Uszkoreit… - arXiv preprint arXiv …, 2017 - codetds.com
Abstract The dominant sequence transduction models are based on complex recurrent or convolutional
neural networks that include an encoder and a decoder. The best performing models also connect the
encoder and decoder through an attention mechanism. We propose a new simple network architecture …
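
The snippet above names the paper's core mechanism, scaled dot-product attention:
Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal single-head sketch in NumPy
(the toy shapes and random inputs are illustrative, not from the paper):

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
        scores -= scores.max(axis=-1, keepdims=True)     # numerically stable softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V                               # attention-weighted values

    # Toy example: 4 positions, d_k = 8
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)

The full model runs several such heads in parallel (multi-head attention), both within the
encoder and to connect the decoder to the encoder, as the snippet describes.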

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

AlpacaFarm: A simulation framework for methods that learn from human feedback

Y Dubois, CX Li, R Taori, T Zhang… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) such as ChatGPT have seen widespread adoption due to
their ability to follow user instructions well. Developing these LLMs involves a complex yet …

Efficient memory management for large language model serving with PagedAttention

W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng… - Proceedings of the 29th …, 2023 - dl.acm.org
High throughput serving of large language models (LLMs) requires batching sufficiently
many requests at a time. However, existing systems struggle because the key-value cache …
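
The snippet points at the bottleneck: each request's key-value cache grows token by token, and
contiguous per-request allocation wastes memory that could hold more batched requests.
PagedAttention instead maps each sequence's logical tokens onto fixed-size physical blocks
through a block table, as in virtual-memory paging. A minimal sketch of that bookkeeping,
assuming a hypothetical block size of 16 tokens and omitting the attention kernel itself
(this illustrates the allocation scheme, not vLLM's actual implementation):

    BLOCK_SIZE = 16  # tokens per KV-cache block (illustrative value)

    class BlockAllocator:
        """Hands out fixed-size KV-cache blocks from a shared memory pool."""
        def __init__(self, num_blocks):
            self.free = list(range(num_blocks))

        def alloc(self):
            if not self.free:
                raise MemoryError("KV cache exhausted; preempt or reject a request")
            return self.free.pop()

        def release(self, blocks):
            self.free.extend(blocks)

    class Sequence:
        """Maps one request's logical tokens onto physical blocks via a block table."""
        def __init__(self, allocator):
            self.allocator = allocator
            self.block_table = []  # logical block index -> physical block id
            self.num_tokens = 0

        def append_token(self):
            # A new physical block is claimed only when the last one is full,
            # so waste is bounded by less than one block per sequence.
            if self.num_tokens % BLOCK_SIZE == 0:
                self.block_table.append(self.allocator.alloc())
            self.num_tokens += 1

    allocator = BlockAllocator(num_blocks=1024)
    seq = Sequence(allocator)
    for _ in range(40):                # 40 tokens -> ceil(40 / 16) = 3 blocks
        seq.append_token()
    print(seq.block_table)             # three physical block ids

Because blocks are claimed on demand rather than reserved up front, more requests fit in the
same memory budget, which enables the larger batches the abstract calls for.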

BloombergGPT: A large language model for finance

S Wu, O Irsoy, S Lu, V Dabravolski, M Dredze… - arXiv preprint arXiv …, 2023 - arxiv.org
The use of NLP in the realm of financial technology is broad and complex, with applications
ranging from sentiment analysis and named entity recognition to question answering. Large …

RLAIF: Scaling reinforcement learning from human feedback with AI feedback

H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) is an effective technique for aligning
large language models (LLMs) to human preferences, but gathering high-quality human …
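
The snippet frames the trade RLAIF makes: the reward-model training step stays the same, but the
pairwise preference labels come from an off-the-shelf LLM rather than human raters. A minimal
sketch of that shared step, the Bradley-Terry preference loss on scalar rewards (the tensor
values and names are illustrative, not from the paper):

    import torch
    import torch.nn.functional as F

    def preference_loss(reward_chosen, reward_rejected):
        # -log sigmoid(r_chosen - r_rejected): trains the reward model to score
        # the preferred response higher, whoever supplied the preference label.
        return -F.logsigmoid(reward_chosen - reward_rejected).mean()

    # Toy batch of 4 preference pairs (labels could be human- or AI-generated)
    r_chosen = torch.tensor([1.2, 0.3, 0.9, 2.0])
    r_rejected = torch.tensor([0.4, 0.5, -0.1, 1.1])
    print(preference_loss(r_chosen, r_rejected).item())

The paper's question is whether AI-generated labels fed through this same pipeline yield
policies comparable to those trained with human-labeled RLHF.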

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Scaling autoregressive models for content-rich text-to-image generation

J Yu, Y Xu, JY Koh, T Luong, G Baid, Z Wang… - arXiv preprint arXiv …, 2022 - 3dvar.com
Abstract We present the Pathways [1] Autoregressive Text-to-Image (Parti) model, which
generates high-fidelity photorealistic images and supports content-rich synthesis involving …