How much are LLMs contaminated? A comprehensive survey and the LLMSanitize library

M Ravaut, B Ding, F Jiao, H Chen, X Li, R Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rise of Large Language Models (LLMs) in recent years, new opportunities are
emerging, but also new challenges, and contamination is quickly becoming critical …

Data contamination report from the 2024 CONDA shared task

O Sainz, I García-Ferrero, A Jacovi, JA Campos… - arXiv preprint arXiv …, 2024 - arxiv.org
The 1st Workshop on Data Contamination (CONDA 2024) focuses on all relevant aspects of
data contamination in natural language processing, where data contamination is understood …

Benchmark Data Contamination of Large Language Models: A Survey

C Xu, S Guan, D Greene, M Kechadi - arXiv preprint arXiv:2406.04244, 2024 - arxiv.org
The rapid development of Large Language Models (LLMs) like GPT-4, Claude-3, and
Gemini has transformed the field of natural language processing. However, it has also …

Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions

Y Fu, O Uzuner, M Yetisgen, F Xia - arXiv preprint arXiv:2410.18966, 2024 - arxiv.org
Large language models (LLMs) have demonstrated great performance across various
benchmarks, showing potential as general-purpose task solvers. However, as LLMs are …

Confounders in instance variation for the analysis of data contamination

B Mehrbakhsh, D Garigliotti… - Proceedings of the …, 2024 - aclanthology.org
Test contamination is a serious problem for the evaluation of large language models (LLMs)
because it leads to the overestimation of their performance and a quick saturation of …

Termite Italian Text-to-SQL: A CALAMITA Challenge

F Ranaldi, ES Ruzzetti, D Onorati… - Proceedings of the 10th …, 2024 - ceur-ws.org
Relational databases play an important role in business, science, and beyond. However, the
operability of relational databases is restricted to users familiar with specific languages such …

IoT and NFT-based Asset Management System in Railway Maintenance

C Caramello, A Cigliano, F Fallucchi… - SYSTEM 2024: 10th …, 2024 - ceur-ws.org
The railway industry is a sector where asset maintenance is paramount in ensuring
passenger safety and service continuity. In this context, the application of the blockchain …

The Impact of Digital Analysis and Large Language Models in Digital Humanity

A Cigliano, F Fallucchi, M Gerardi - … of Yearly Reports on Informatics …, 2024 - ceur-ws.org
The advent of digital analysis tools and Large Language Models (LLMs) has significantly
altered the landscape of digital humanities, introducing new methodologies for processing …

The limits of Italian in Reasoning Tasks

L Ranaldi, F Ranaldi, G Pucci, ES Ruzzetti… - 2024 - ceur-ws.org
Earlier works have been showing the efficacy of reasoning methods in eliciting step-wise
reasoning of large language models (LLMs) by operating via in-context demonstrations …

How far does the sequence of compositions impact Multilingual Pre-Training?

L Ranaldi, G Pucci, FM Zanzotto - 2024 - ceur-ws.org
An efficient strategy for conducting pre-training of language models is the concatenation of
contiguous sequences of text of fixed length through causal masking that estimates the …