A survey of text watermarking in the era of large language models

A Liu, L Pan, Y Lu, J Li, X Hu, X Zhang, L Wen… - ACM Computing …, 2024 - dl.acm.org
Text watermarking algorithms are crucial for protecting the copyright of textual content.
Historically, their capabilities and application scenarios were limited. However, recent …

Undetectable watermarks for language models

M Christ, S Gunn, O Zamir - The Thirty Seventh Annual …, 2024 - proceedings.mlr.press
Recent advances in the capabilities of large language models such as GPT-4 have spurred
increasing concern about our ability to detect AI-generated text. Prior works have suggested …

Unbiased watermark for large language models

Z Hu, L Chen, X Wu, Y Wu, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent advancements in large language models (LLMs) have sparked a growing
apprehension regarding the potential misuse. One approach to mitigating this risk is to …

REMARK-LLM: A robust and efficient watermarking framework for generative large language models

R Zhang, SS Hussain, P Neekhara… - 33rd USENIX Security …, 2024 - usenix.org
We present REMARK-LLM, a novel, efficient, and robust watermarking framework designed
for texts generated by large language models (LLMs). Synthesizing human-like content …

Publicly detectable watermarking for language models

J Fairoze, S Garg, S Jha, S Mahloujifar… - arXiv preprint arXiv …, 2023 - arxiv.org
We construct the first provable watermarking scheme for language models with public
detectability or verifiability: we use a private key for watermarking and a public key for …

Advancing beyond identification: Multi-bit watermark for language models

KY Yoo, W Ahn, N Kwak - arXiv preprint arXiv:2308.00221, 2023 - arxiv.org
This study aims to proactively tackle misuse of large language models beyond identification
of machine-generated text. While existing methods focus on detection, some malicious …

DiPmark: A stealthy, efficient and resilient watermark for large language models

Y Wu, Z Hu, H Zhang, H Huang - arXiv preprint arXiv:2310.07710, 2023 - arxiv.org
Watermarking techniques offer a promising way to secure data via embedding covert
information into the data. A paramount challenge in the domain lies in preserving the …

HowkGPT: Investigating the detection of ChatGPT-generated university student homework through context-aware perplexity analysis

C Vasilatos, M Alam, T Rahwan, Y Zaki… - arXiv preprint arXiv …, 2023 - arxiv.org
As the use of Large Language Models (LLMs) in text generation tasks proliferates, concerns
arise over their potential to compromise academic integrity. The education sector currently …

AuthentiGPT: Detecting machine-generated text via black-box language models denoising

Z Guo, S Yu - arXiv preprint arXiv:2311.07700, 2023 - arxiv.org
Large language models (LLMs) have opened up enormous opportunities while
simultaneously posing ethical dilemmas. One of the major concerns is their ability to create …

Advancing beyond identification: Multi-bit watermark for large language models

KY Yoo, W Ahn, N Kwak - Proceedings of the 2024 Conference of …, 2024 - aclanthology.org
We show the viability of tackling misuses of large language models beyond the identification
of machine-generated text. While existing zero-bit watermark methods focus on detection …