A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

FiT: Flexible vision transformer for diffusion model

Z Lu, Z Wang, D Huang, C Wu, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Nature is infinitely resolution-free. In the context of this reality, existing diffusion models, such
as Diffusion Transformers, often face challenges when processing image resolutions outside …

Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities

T Ju, Y Wang, X Ma, P Cheng, H Zhao, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid adoption of large language models (LLMs) in multi-agent systems has highlighted
their impressive capabilities in various applications, such as collaborative problem-solving …

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models

H Que, J Liu, G Zhang, C Zhang, X Qu, Y Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
Continual Pre-Training (CPT) on Large Language Models (LLMs) has been widely used to
expand the model's fundamental understanding of specific downstream domains (e.g., math …

Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

Z He, G Feng, S Luo, K Yang, D He, J Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we leverage the intrinsic segmentation of language sequences and design a
new positional encoding method called Bilevel Positional Encoding (BiPE). For each …

Context Length Extension via Generalized Extrapolation Scale

L Li, Z Huaping - Findings of the Association for Computational …, 2024 - aclanthology.org
Context length expansion of transformer models is considered a key challenge, especially
when handling context beyond the training length during inference stage. In this paper, we …

Institutional Platform for Secure Self-Service Large Language Model Exploration

VK Bumgardner, MA Klusty, WV Logan… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces a user-friendly platform developed by the University of Kentucky
Center for Applied AI, designed to make large, customized language models (LLMs) more …