A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

Efficient and effective text encoding for Chinese LLaMA and Alpaca

Y Cui, Z Yang, X Yao - arXiv preprint arXiv:2304.08177, 2023 - arxiv.org
Large Language Models (LLMs), such as ChatGPT and GPT-4, have dramatically
transformed natural language processing research and shown promising strides towards …

Llemma: An open language model for mathematics

Z Azerbayev, H Schoelkopf, K Paster… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Llemma, a large language model for mathematics. We continue pretraining
Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing …

Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, Z Jiang, J Lai, Z Li, Y Yao… - arXiv preprint arXiv …, 2023 - arxiv.org
With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

OpenAgents: An open platform for language agents in the wild

T Xie, F Zhou, Z Cheng, P Shi, L Weng, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Language agents show potential in being capable of utilizing natural language for varied
and intricate tasks in diverse environments, particularly when built upon large language …

A simple recipe for contrastively pre-training video-first encoders beyond 16 frames

P Papalampidi, S Koppula, S Pathak… - Proceedings of the …, 2024 - openaccess.thecvf.com
Understanding long real-world videos requires modeling of long-range visual
dependencies. To this end, we explore video-first architectures building on the common …

EscherNet: A generative model for scalable view synthesis

X Kong, S Liu, X Lyu, M Taher, X Qi… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce EscherNet, a multi-view conditioned diffusion model for view synthesis.
EscherNet learns implicit and generative 3D representations coupled with a specialised …