RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …

On leakage of code generation evaluation datasets

A Matton, T Sherborne, D Aumiller… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we consider contamination by code generation test sets, in particular in their
use in modern large language models. We discuss three possible sources of such …

Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation

A Imani, I Ahmed, M Moshirpour - arXiv preprint arXiv:2408.02502, 2024 - arxiv.org
Commit messages provide descriptions of the modifications made in a commit using natural
language, making them crucial for software maintenance and evolution. Recent …

Scaling Granite Code Models to 128K Context

M Stallone, V Saxena, L Karlinsky, B McGinn… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces long-context Granite code models that support effective context
windows of up to 128K tokens. Our solution for scaling context length of Granite 3B/8B code …

Repository Structure-Aware Training Makes SLMs Better Issue Resolver

Z Ma, S An, Z Lin, Y Zou, B Xie - arXiv preprint arXiv:2412.19031, 2024 - arxiv.org
Language models have been applied to various software development tasks, but the
performance varies according to the scale of the models. Large Language Models (LLMs) …

Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks

Z Yang - arXiv preprint arXiv:2409.06338, 2024 - arxiv.org
We argue that there are two major distinct capabilities in long context understanding:
retrieval and holistic understanding. Understanding and further improving LLMs' long …

SharedContextBench: Evaluating Long-Context Methods in KV Cache Reuse

Y Li, H Jiang, Q Wu, X Luo, S Ahn, C Zhang, AH Abdi… - neurips2024-enlsp.github.io
Long-context Large Language Models (LLMs) have unlocked numerous
possibilities for downstream applications, many of which involve multiple requests sharing …
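
To make the "multiple requests sharing" setting concrete, here is a minimal sketch (not any real serving framework's API; class and variable names are hypothetical) of prefix KV cache reuse: the key/value state for a shared context is computed once and reused, so only each request's suffix incurs new prefill work.

```python
class ToyKVCache:
    """Toy stand-in for a prefix KV cache keyed by the shared context text."""

    def __init__(self):
        self._store = {}  # shared prefix -> simulated KV state

    def get_or_compute(self, prefix: str) -> str:
        if prefix not in self._store:
            # Stand-in for running the model over the shared prefix once.
            self._store[prefix] = f"<kv-state for {len(prefix)} chars>"
        return self._store[prefix]


cache = ToyKVCache()
shared_doc = "...long shared document..."
for question in ["What is the main claim?", "List the datasets used."]:
    kv = cache.get_or_compute(shared_doc)  # reused across both requests
    # decode(question, kv)  # only the question itself would need fresh prefill
```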