Don't make your LLM an evaluation benchmark cheater

K Zhou, Y Zhu, Z Chen, W Chen, WX Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence,
attaining remarkable improvement in model capacity. To assess the model performance, a …

An analysis of large language models: their impact and potential applications

G Bharathi Mohan, R Prasanna Kumar… - … and Information Systems, 2024 - Springer
Large language models (LLMs) have transformed the interpretation and creation of human
language in the rapidly developing field of computerized language processing. These …

How much are LLMs contaminated? A comprehensive survey and the LLMSanitize library

M Ravaut, B Ding, F Jiao, H Chen, X Li, R Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rise of Large Language Models (LLMs) in recent years, new opportunities are
emerging, but also new challenges, and contamination is quickly becoming critical …

Rethinking machine unlearning for large language models

S Liu, Y Yao, J Jia, S Casper, N Baracaldo… - arXiv preprint arXiv …, 2024 - arxiv.org
We explore machine unlearning (MU) in the domain of large language models (LLMs),
referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence …

Did the neurons read your book? Document-level membership inference for large language models

M Meeus, S Jain, M Rei, YA de Montjoye - 33rd USENIX Security …, 2024 - usenix.org
With large language models (LLMs) poised to become embedded in our daily lives,
questions are starting to be raised about the data they learned from. These questions range …

Black-box access is insufficient for rigorous AI audits

S Casper, C Ezell, C Siegmann, N Kolt… - The 2024 ACM …, 2024 - dl.acm.org
External audits of AI systems are increasingly recognized as a key mechanism for AI
governance. The effectiveness of an audit, however, depends on the degree of access …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling

D Kim, C Park, S Kim, W Lee, W Song, Y Kim… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,
demonstrating superior performance in various natural language processing (NLP) tasks …

ChatGPT's one-year anniversary: are open-source large language models catching up?

H Chen, F Jiao, X Li, C Qin, M Ravaut, R Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Upon its release in late 2022, ChatGPT has brought a seismic shift in the entire landscape of
AI, both in research and commerce. Through instruction-tuning a large language model …

RAFT: Adapting language model to domain specific RAG

T Zhang, SG Patil, N Jain, S Shen, M Zaharia… - arXiv preprint arXiv …, 2024 - arxiv.org
Pretraining Large Language Models (LLMs) on large corpora of textual data is now a
standard paradigm. When using these LLMs for many downstream applications, it is …