ShortGPT: Layers in large language models are more redundant than you expect

X Men, M Xu, Q Zhang, B Wang, H Lin, Y Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) continue to advance in performance, their size has
escalated significantly, with current LLMs containing billions or even trillions of parameters …

Exploring Learngene via Stage-wise Weight Sharing for Initializing Variable-sized Models

SY Xia, W Zhu, X Yang, X Geng - arXiv preprint arXiv:2404.16897, 2024 - arxiv.org
In practice, we usually need to build variable-sized models adapting for diverse resource
constraints in different application scenarios, where weight initialization is an important step …

On Speculative Decoding for Multimodal Large Language Models

M Gagrani, R Goel, W Jeon, J Park, M Lee… - arXiv preprint arXiv …, 2024 - arxiv.org
Inference with Multimodal Large Language Models (MLLMs) is slow due to their
large-language-model backbone which suffers from memory bandwidth bottleneck and generates …

Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs

R Goel, M Gagrani, W Jeon, J Park, M Lee… - arXiv preprint arXiv …, 2024 - arxiv.org
Text generation with Large Language Models (LLMs) is known to be memory bound due to
the combination of their auto-regressive nature, huge parameter counts, and limited memory …

A deeper look at depth pruning of LLMs

SA Siddiqui, X Dong, G Heinrich, T Breuel… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are not only resource-intensive to train but even more
costly to deploy in production. Therefore, recent work has attempted to prune blocks of LLMs …

BlockPruner: Fine-grained Pruning for Large Language Models

L Zhong, F Wan, R Chen, X Quan, L Li - arXiv preprint arXiv:2406.10594, 2024 - arxiv.org
With the rapid growth in the size and complexity of large language models (LLMs), the costs
associated with their training and inference have escalated significantly. Research indicates …

Research on a Flower Recognition Method Based on Masked Autoencoders

Y Li, Y Lv, Y Ding, H Zhu, H Gao, L Zheng - Horticulturae, 2024 - mdpi.com
Accurate and efficient flower identification holds significant importance not only for the
general public—who may use this information for educational, recreational, or conservation …