Unlimiformer: Long-range transformers with unlimited length input

A Bertsch, U Alon, G Neubig… - Advances in Neural …, 2024 - proceedings.neurips.cc
Since the proposal of transformers, these models have been limited to bounded input
lengths, because of their need to attend to every token in the input. In this work, we propose …
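A minimal sketch of the general idea behind retrieval-based attention over very long inputs, as described in this entry: rather than attending to every encoder state, each decoder query retrieves only its top-k nearest encoder states from an index. The shapes, the brute-force dot-product "index", and the softmax details are illustrative assumptions, not Unlimiformer's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64                      # hidden size (assumption)
n_input = 100_000           # an effectively unbounded number of encoder states
k = 32                      # how many states each query attends to

encoder_states = rng.standard_normal((n_input, d)).astype(np.float32)

def retrieval_attention(query, index, top_k=k):
    """Attend only to the top_k encoder states most similar to `query`."""
    scores = index @ query                       # brute-force stand-in for a kNN index
    top = np.argpartition(scores, -top_k)[-top_k:]
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()
    return weights @ index[top]                  # weighted sum of retrieved states

decoder_query = rng.standard_normal(d).astype(np.float32)
context = retrieval_attention(decoder_query, encoder_states)
print(context.shape)        # (64,): one context vector per decoder query
```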

Sub-sentence encoder: Contrastive learning of propositional semantic representations

S Chen, H Zhang, T Chen, B Zhou, W Yu, D Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce the sub-sentence encoder, a contrastively learned contextual embedding model
for fine-grained semantic representation of text. In contrast to the standard practice with …
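A hedged sketch of a generic in-batch contrastive (InfoNCE) objective of the kind used to train such embedding models; the sub-sentence encoder's actual architecture and how it constructs proposition-level positives and negatives are not shown in the snippet, so the pairing below is purely illustrative.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.05):
    """anchors[i] and positives[i] encode the same proposition; other rows act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # cosine-similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # pull matched pairs together

rng = np.random.default_rng(0)
anchors, positives = rng.standard_normal((2, 16, 128))
print(info_nce(anchors, positives))
```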

Llavolta: Efficient multi-modal models via stage-wise visual context compression

J Chen, L Ye, J He, ZY Wang, D Khashabi… - arXiv preprint arXiv …, 2024 - arxiv.org
While significant advancements have been made in compressed representations for text
embeddings in large language models (LLMs), the compression of visual tokens in large …
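A minimal sketch of the general idea of visual context compression: reduce the number of visual tokens fed to the LLM by pooling neighbouring tokens. The stage-wise schedule and the pooling operator below are assumptions for illustration, not necessarily the scheme used in this paper.

```python
import numpy as np

def compress_visual_tokens(tokens, stride):
    """Average-pool every `stride` consecutive visual tokens into one."""
    n, d = tokens.shape
    n_keep = n // stride * stride
    return tokens[:n_keep].reshape(-1, stride, d).mean(axis=1)

visual_tokens = np.random.default_rng(0).standard_normal((576, 1024))  # e.g. ViT patch tokens
for stage, stride in enumerate([1, 2, 4], start=1):   # progressively stronger compression
    n_out = compress_visual_tokens(visual_tokens, stride).shape[0]
    print(f"stage {stage}: {n_out} visual tokens")
```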

Optimizing retrieval-augmented reader models via token elimination

M Berchansky, P Izsak, A Caciularu, I Dagan… - arXiv preprint arXiv …, 2023 - arxiv.org
Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across
a variety of open-domain tasks, such as question answering, fact checking, etc. In FiD …
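A hedged sketch of token elimination in a FiD-style reader: rank the encoded passage tokens by some saliency signal (here a stand-in "cross-attention score" array) and keep only the top fraction for the decoder to attend to. The scoring signal and keep ratio are assumptions, not the paper's recipe.

```python
import numpy as np

def eliminate_tokens(token_states, saliency, keep_ratio=0.25):
    """Keep the `keep_ratio` most salient tokens, preserving their original order."""
    n_keep = max(1, int(len(saliency) * keep_ratio))
    keep = np.sort(np.argsort(saliency)[-n_keep:])
    return token_states[keep]

rng = np.random.default_rng(0)
states = rng.standard_normal((2000, 768))   # encoded tokens from many retrieved passages
scores = rng.random(2000)                   # e.g. accumulated cross-attention mass
print(eliminate_tokens(states, scores).shape)   # (500, 768)
```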

CoLLEGe: Concept Embedding Generation for Large Language Models

R Teehan, B Lake, M Ren - arXiv preprint arXiv:2403.15362, 2024 - arxiv.org
Current language models are unable to quickly learn new concepts on the fly, often
requiring a more involved finetuning process to learn robustly. Prompting in-context is not …
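A minimal sketch of the general setup of concept embedding generation: map a few example sentences that use a new concept to a single embedding that is added to the LM's embedding table as a new token. The mean-pooling "generator" below is a placeholder for illustration, not CoLLEGe's learned model, and all sizes are toy assumptions.

```python
import numpy as np

def generate_concept_embedding(example_vectors):
    """Collapse support-example encodings into one new-token embedding."""
    return example_vectors.mean(axis=0)

rng = np.random.default_rng(0)
support = rng.standard_normal((4, 256))          # encodings of 4 example sentences (toy size)
embedding_table = rng.standard_normal((1000, 256))
new_row = generate_concept_embedding(support)
embedding_table = np.vstack([embedding_table, new_row[None, :]])  # add the new concept token
print(embedding_table.shape)                     # (1001, 256)
```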

Efficient large multi-modal models via visual context compression

J Chen, L Ye, J He, ZY Wang, D Khashabi… - The Thirty-eighth …, 2024 - openreview.net
While significant advancements have been made in compressed representations for text
embeddings in large language models (LLMs), the compression of visual tokens in multi …

When Text Embedding Meets Large Language Model: A Comprehensive Survey

Z Nie, Z Feng, M Li, C Zhang, Y Zhang, D Long… - arXiv preprint arXiv …, 2024 - arxiv.org
Text embedding has become a foundational technology in natural language processing
(NLP) during the deep learning era, driving advancements across a wide array of …

From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression

E Choi, S Lee, M Choi, J Park, J Lee - arXiv preprint arXiv:2410.04139, 2024 - arxiv.org
Large language models (LLMs) have achieved significant performance gains using
advanced prompting techniques across various tasks. However, the increasing length of …
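A hedged sketch of extractive prompt compression in general: score the units of a long prompt (sentences here) against the question and keep only the highest-scoring ones within a token budget. The overlap-based scorer and the budget are illustrative stand-ins, not the multi-document reader proposed in the paper.

```python
def compress_prompt(sentences, question, budget_words=15):
    """Keep the sentences most lexically similar to the question, within a word budget."""
    q_terms = set(question.lower().split())
    scored = sorted(sentences, key=lambda s: -len(q_terms & set(s.lower().split())))
    kept, used = [], 0
    for sent in scored:
        n = len(sent.split())
        if used + n <= budget_words:
            kept.append(sent)
            used += n
    return " ".join(s for s in sentences if s in kept)   # restore original order

docs = [
    "The Amazon river is in South America.",
    "It is roughly 6,400 km long.",
    "Coffee production in Brazil began in the 18th century.",
]
print(compress_prompt(docs, "How long is the Amazon river?"))
```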

Nugget 2D: Dynamic Contextual Compression for Scaling Decoder-only Language Models

G Qin, C Rosset, EC Chau, N Rao… - arXiv preprint arXiv …, 2023 - arxiv.org
Standard Transformer-based language models (LMs) scale poorly to long contexts. We
propose a solution based on dynamic contextual compression, which extends the Nugget …
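A minimal sketch of the idea of dynamic contextual compression: a scorer picks a small subset of context token states ("nuggets"), and only those states are kept for the decoder to attend to, shrinking the effective context. The linear scorer and the fixed compression ratio are assumptions for illustration, not Nugget 2D's trained components.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx, ratio = 256, 4096, 0.1

hidden = rng.standard_normal((n_ctx, d))          # per-token states of a long context
scorer_w = rng.standard_normal(d)                 # stand-in for a learned nugget scorer

scores = hidden @ scorer_w
n_nuggets = int(n_ctx * ratio)
nugget_idx = np.sort(np.argsort(scores)[-n_nuggets:])   # keep selected tokens in order
compressed_context = hidden[nugget_idx]
print(compressed_context.shape)                   # (409, 256): ~10x fewer states to attend to
```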

Generation with Dynamic Vocabulary

Y Liu, T Ji, C Sun, Y Wu, X Wang - arXiv preprint arXiv:2410.08481, 2024 - arxiv.org
We introduce a new dynamic vocabulary for language models. It can incorporate arbitrary text
spans during generation. These text spans act as basic generation bricks, akin to tokens in …
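A hedged sketch of generation with a dynamic vocabulary: at each step the model scores ordinary tokens and a set of candidate text spans jointly, and emitting a span advances the output by several tokens at once. The dot-product scoring against span and token embeddings is an illustrative assumption, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
token_vocab = ["the", "a", "cat", "sat"]
span_vocab = ["on the mat", "in the hat"]          # spans added to the vocabulary on the fly

token_emb = rng.standard_normal((len(token_vocab), d))
span_emb = rng.standard_normal((len(span_vocab), d))
hidden = rng.standard_normal(d)                    # decoder state at the current step

logits = np.concatenate([token_emb @ hidden, span_emb @ hidden])
entries = token_vocab + span_vocab
choice = int(np.argmax(logits))
print("emitted:", repr(entries[choice]))           # a single token or a whole span
```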