A survey on large language models: Applications, challenges, limitations, and practical usage

MU Hadi, R Qureshi, A Shah, M Irfan, A Zafar… - Authorea …, 2023 - techrxiv.org
Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

CRAFT: Customizing LLMs by creating and retrieving from specialized toolsets

L Yuan, Y Chen, X Wang, YR Fung, H Peng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are often augmented with tools to solve complex tasks. By
generating code snippets and executing them through task-specific Application …

Cobra: Extending Mamba to multi-modal large language model for efficient inference

H Zhao, M Zhang, W Zhao, P Ding, S Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, the application of multimodal large language models (MLLM) in various
fields has achieved remarkable success. However, as the foundation model for many …

Macaw-LLM: Multi-modal language modeling with image, audio, video, and text integration

C Lyu, M Wu, L Wang, X Huang, B Liu, Z Du… - arXiv preprint arXiv …, 2023 - arxiv.org
Although instruction-tuned large language models (LLMs) have exhibited remarkable
capabilities across various NLP tasks, their effectiveness on other data modalities beyond …

MLLM-Bench, evaluating multi-modal LLMs using GPT-4V

W Ge, S Chen, G Chen, J Chen, Z Chen, S Yan… - arXiv preprint arXiv …, 2023 - arxiv.org
In the pursuit of Artificial General Intelligence (AGI), the integration of vision in language
models has marked a significant milestone. The advent of vision-language models (MLLMs) …

SPHINX-X: Scaling data and parameters for a family of multi-modal large language models

P Gao, R Zhang, C Liu, L Qiu, S Huang, W Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose SPHINX-X, an extensive Multimodality Large Language Model (MLLM) series
developed upon SPHINX. To improve the architecture and training efficiency, we modify the …

The (r)evolution of multimodal large language models: A survey

D Caffagni, F Cocchi, L Barsellotti, N Moratelli… - arXiv preprint arXiv …, 2024 - arxiv.org
Connecting text and visual modalities plays an essential role in generative intelligence. For
this reason, inspired by the success of large language models, significant research efforts …

Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, Z Jiang, J Lai, Z Li, Y Yao… - arXiv preprint arXiv …, 2023 - arxiv.org
With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

Benchmarking large language model capabilities for conditional generation

J Maynez, P Agrawal, S Gehrmann - arXiv preprint arXiv:2306.16793, 2023 - arxiv.org
Pre-trained large language models (PLMs) underlie most new developments in natural
language processing. They have shifted the field from application-specific model pipelines …

Rethinking mobile AI ecosystem in the LLM era

J Yuan, C Yang, D Cai, S Wang, X Yuan… - arXiv preprint arXiv …, 2023 - arxiv.org
In today's landscape, smartphones have evolved into hubs for hosting a multitude of deep
learning models aimed at local execution. A key realization driving this work is the notable …