RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …

On leakage of code generation evaluation datasets

A Matton, T Sherborne, D Aumiller… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we consider contamination by code generation test sets, in particular in their
use in modern large language models. We discuss three possible sources of such …

Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation

A Imani, I Ahmed, M Moshirpour - arXiv preprint arXiv:2408.02502, 2024 - arxiv.org
Commit messages provide descriptions of the modifications made in a commit using natural
language, making them crucial for software maintenance and evolution. Recent …

Scaling Granite Code Models to 128K Context

M Stallone, V Saxena, L Karlinsky, B McGinn… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces long-context Granite code models that support effective context
windows of up to 128K tokens. Our solution for scaling context length of Granite 3B/8B code …

Repository Structure-Aware Training Makes SLMs Better Issue Resolver

Z Ma, S An, Z Lin, Y Zou, B Xie - arXiv preprint arXiv:2412.19031, 2024 - arxiv.org
Language models have been applied to various software development tasks, but the
performance varies according to the scale of the models. Large Language Models (LLMs) …

Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks

Z Yang - arXiv preprint arXiv:2409.06338, 2024 - arxiv.org
We argue that there are two major distinct capabilities in long context understanding:
retrieval and holistic understanding. Understanding and further improving LLMs' long …

SharedContextBench: Evaluating Long-Context Methods in KV Cache Reuse

Y Li, H Jiang, Q Wu, X Luo, S Ahn, C Zhang, AH Abdi… - neurips2024-enlsp.github.io
Long-context Large Language Models (LLMs) have unlocked numerous
possibilities for downstream applications, many of which involve multiple requests sharing …
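
To make the "multiple requests sharing" setting concrete, here is a minimal sketch (not any real serving framework's API; class and variable names are hypothetical) of prefix KV cache reuse: the key/value state for a shared context is computed once and reused, so only each request's suffix incurs new prefill work.

```python
class ToyKVCache:
    """Toy stand-in for a prefix KV cache keyed by the shared context text."""

    def __init__(self):
        self._store = {}  # shared prefix -> simulated KV state

    def get_or_compute(self, prefix: str) -> str:
        if prefix not in self._store:
            # Stand-in for running the model over the shared prefix once.
            self._store[prefix] = f"<kv-state for {len(prefix)} chars>"
        return self._store[prefix]


cache = ToyKVCache()
shared_doc = "...long shared document..."
for question in ["What is the main claim?", "List the datasets used."]:
    kv = cache.get_or_compute(shared_doc)  # reused across both requests
    # decode(question, kv)  # only the question itself would need fresh prefill
```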