Big data security and privacy protection

冯登国, 张敏, 李昊 - 2014 - cjc.ict.ac.cn
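Abstract: Big data has become a research hotspot in academia and industry, and is influencing people's daily lives,
work habits, and ways of thinking. However, big data currently faces many security risks during collection, storage, and use …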

A watermark for large language models

J Kirchenbauer, J Geiping, Y Wen… - International …, 2023 - proceedings.mlr.press
Potential harms of large language models can be mitigated by watermarking model output,
i.e., embedding signals into generated text that are invisible to humans but algorithmically …

Undetectable watermarks for language models

M Christ, S Gunn, O Zamir - The Thirty Seventh Annual …, 2024 - proceedings.mlr.press
Recent advances in the capabilities of large language models such as GPT-4 have spurred
increasing concern about our ability to detect AI-generated text. Prior works have suggested …

A review of text watermarking: theory, methods, and applications

NS Kamaruddin, A Kamsin, LY Por, H Rahman - IEEE Access, 2018 - ieeexplore.ieee.org
During the recent years, the issue of preserving the integrity of digital text has become a
focus of interest in the transmission of online content on the Internet. Watermarking has a …

On the dangers of stochastic parrots: Can language models be too big?🦜

EM Bender, T Gebru, A McMillan-Major… - Proceedings of the 2021 …, 2021 - dl.acm.org
The past 3 years of work in NLP have been characterized by the development and
deployment of ever larger language models, especially for English. BERT, its variants, GPT …

A survey of big data security and privacy preserving

W Fang, XZ Wen, Y Zheng, M Zhou - IETE Technical Review, 2017 - Taylor & Francis
Nowadays, big data has become ubiquitous. Big data contains great value and chance.
However, big data also brings many security risks and privacy-preserving problems. Security …

Provable robust watermarking for AI-generated text

X Zhao, P Ananth, L Li, YX Wang - arXiv preprint arXiv:2306.17439, 2023 - arxiv.org
As AI-generated text increasingly resembles human-written content, the ability to detect
machine-generated text becomes crucial. To address this challenge, we present …

Robust multi-bit natural language watermarking through invariant features

KY Yoo, W Ahn, J Jang, N Kwak - arXiv preprint arXiv:2305.01904, 2023 - arxiv.org
Recent years have witnessed a proliferation of valuable original natural language contents
found in subscription-based media outlets, web novel platforms, and outputs of large …

Watermarks in the sand: Impossibility of strong watermarking for generative models

H Zhang, BL Edelman, D Francati, D Venturi… - arXiv preprint arXiv …, 2023 - arxiv.org
Watermarking generative models consists of planting a statistical signal (watermark) in a
model's output so that it can be later verified that the output was generated by the given …

Who wrote this code? Watermarking for code generation

T Lee, S Hong, J Ahn, I Hong, H Lee, S Yun… - arXiv preprint arXiv …, 2023 - arxiv.org
With the remarkable generation performance of large language models, ethical and legal
concerns about using them have been raised, such as plagiarism and copyright issues. For …