Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks

C Wang, H Duan, S Zhang, D Lin, K Chen - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, the large language model (LLM) community has shown increasing interest in
enhancing LLMs' capability to handle extremely long documents. As various long-text …

BAMBOO: A comprehensive benchmark for evaluating long text modeling capacities of large language models

Z Dong, T Tang, J Li, WX Zhao, JR Wen - arXiv preprint arXiv:2309.13345, 2023 - arxiv.org
Large language models (LLMs) have achieved remarkable proficiency on NLP tasks of
normal length. Recently, multiple studies have committed to extending the context length …

LongLLMLingua: Accelerating and enhancing LLMs in long context scenarios via prompt compression

H Jiang, Q Wu, X Luo, D Li, CY Lin, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
In long context scenarios, large language models (LLMs) face three main challenges: higher
computational/financial cost, longer latency, and inferior performance. Some studies reveal …

CLongEval: A Chinese benchmark for evaluating long-context large language models

Z Qiu, J Li, S Huang, W Zhong, I King - arXiv preprint arXiv:2403.03514, 2024 - arxiv.org
Developing Large Language Models (LLMs) with robust long-context capabilities has been
the recent research focus, resulting in the emergence of long-context LLMs proficient in …

LooGLE: Can Long-Context Language Models Understand Long Contexts?

J Li, M Wang, Z Zheng, M Zhang - arXiv preprint arXiv:2311.04939, 2023 - arxiv.org
Large language models (LLMs), despite their impressive performance on various language
tasks, are typically limited to processing texts within their context-window size. This limitation has …

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Y Ding, LL Zhang, C Zhang, Y Xu, N Shang… - arXiv preprint arXiv …, 2024 - arxiv.org
A large context window is a desirable feature in large language models (LLMs). However,
due to high fine-tuning costs, scarcity of long texts, and catastrophic values introduced by …

ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models

T Thonet, J Rozen, L Besacier - arXiv preprint arXiv:2403.20262, 2024 - arxiv.org
Research on Large Language Models (LLMs) has recently witnessed an increasing interest
in extending models' context size to better capture dependencies within long documents …

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

J Liu, Z Bai, Y Zhang, C Zhang, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Typically, training LLMs with long context sizes is computationally expensive, requiring
extensive training hours and GPU resources. Existing long-context extension methods …

Counting-Stars: A simple, efficient, and reasonable strategy for evaluating long-context large language models

M Song, M Zheng, X Luo - arXiv preprint arXiv:2403.11802, 2024 - arxiv.org
While recent research endeavors have concentrated on developing Large Language
Models (LLMs) with robust long-context capabilities, due to the lack of appropriate …

XLBench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies

X Ni, H Cai, X Wei, S Wang, D Yin, P Li - arXiv preprint arXiv:2404.05446, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable performance across
diverse tasks but are constrained by their small context window sizes. Various efforts have …