Can Copyright be Reduced to Privacy?

N Elkin-Koren, U Hacohen, R Livni, S Moran - arXiv preprint arXiv …, 2023 - arxiv.org
There is a growing concern that generative AI models will generate outputs closely
resembling the copyrighted materials on which they are trained. This worry has intensified …

Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices

S Abdali, R Anarfi, CJ Barberan, J He - arXiv preprint arXiv:2403.12503, 2024 - arxiv.org
Large language models (LLMs) have significantly transformed the landscape of Natural
Language Processing (NLP). Their impact extends across a diverse spectrum of tasks …

The curious case of nonverbal abstract reasoning with multi-modal large language models

K Ahrabian, Z Sourati, K Sun, J Zhang, Y Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) are still being adopted in new domains and utilized in
novel applications, we are experiencing an influx of a new generation of foundation …

Recite, reconstruct, recollect: Memorization in LMs as a multifaceted phenomenon

US Prashanth, A Deng, K O'Brien, J SV… - arXiv preprint arXiv …, 2024 - arxiv.org
Memorization in language models is typically treated as a homogeneous phenomenon,
neglecting the specifics of the memorized data. We instead model memorization as the effect …

Mothman at SemEval-2024 Task 9: An Iterative System for Chain-of-Thought Prompt Optimization

APC Chen, R Groshan, S Von Bayern - arXiv preprint arXiv:2405.02517, 2024 - arxiv.org
Extensive research exists on the performance of large language models on logic-based
tasks, whereas relatively little has been done on their ability to generate creative solutions …

Feedback processing in the primate brain and in AI systems

Y Jiang, S He - Science China Technological Sciences, 2024 - Springer
The primate brain and artificial intelligence (AI) can both be conceptualized as information
processing systems, each with its own distinct biological and computational architectures …

Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation

C Deng, Y Zhao, Y Heng, Y Li, J Cao, X Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
Data contamination has garnered increased attention in the era of large language models
(LLMs) due to the reliance on extensive internet-derived training corpora. The issue of …

Replication in Visual Diffusion Models: A Survey and Outlook

W Wang, Y Sun, Z Yang, Z Hu, Z Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual diffusion models have revolutionized the field of creative AI, producing high-quality
and diverse content. However, they inevitably memorize training images or videos …

U Can't Gen This? A Survey of Intellectual Property Protection Methods for Data in Generative AI

T Šarčević, A Karlowicz, R Mayer… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Generative AI (GAI) models have the unparalleled ability to generate text, images,
audio, and other forms of media that are increasingly indistinguishable from human …

Localizing Paragraph Memorization in Language Models

N Stoehr, M Gordon, C Zhang, O Lewis - arXiv preprint arXiv:2403.19851, 2024 - arxiv.org
Can we localize the weights and mechanisms used by a language model to memorize and
recite entire paragraphs of its training data? In this paper, we show that while memorization …