A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly

Y Yao, J Duan, K Xu, Y Cai, Z Sun, Y Zhang - High-Confidence Computing, 2024 - Elsevier
Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized
natural language understanding and generation. They possess deep language …

I know what you trained last summer: A survey on stealing machine learning models and defences

D Oliynyk, R Mayer, A Rauber - ACM Computing Surveys, 2023 - dl.acm.org
Machine-Learning-as-a-Service (MLaaS) has become a widespread paradigm, making
even the most complex Machine Learning models available for clients via, e.g., a pay-per …
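
A minimal sketch of the extraction setting this survey covers, assuming PyTorch and using a toy local network in place of a real MLaaS endpoint (`query_victim`, the architectures, and all hyperparameters here are illustrative, not from the paper): the attacker trains a local surrogate purely on the probabilities the victim returns.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Frozen local network standing in for the remote pay-per-query API;
# in the real setting the attacker sees only its outputs.
victim = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5)).eval()
surrogate = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))

def query_victim(x):
    with torch.no_grad():                      # black-box access: outputs only
        return F.softmax(victim(x), dim=1)

opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for step in range(500):
    x = torch.randn(64, 20)                    # attacker-chosen queries
    soft_labels = query_victim(x)              # each call would cost money
    loss = F.kl_div(F.log_softmax(surrogate(x), dim=1),
                    soft_labels, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```

The surveyed attacks differ mainly in how queries are chosen and what the API reveals (probabilities vs. labels); the loop above is the common skeleton.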

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, R Guo, H Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

On protecting the data privacy of large language models (LLMs): A survey

B Yan, K Li, M Xu, Y Dong, Y Zhang, Z Ren… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are complex artificial intelligence systems capable of
understanding, generating and translating human language. They learn language patterns …

BppAttack: Stealthy and efficient Trojan attacks against deep neural networks via image quantization and contrastive adversarial learning

Z Wang, J Zhai, S Ma - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Deep neural networks are vulnerable to Trojan attacks. Existing attacks use visible patterns
(e.g., a patch or image transformations) as triggers, which are vulnerable to human …

Towards data-free model stealing in a hard label setting

S Sanyal, S Addepalli, RV Babu - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Machine learning models deployed as a service (MLaaS) are susceptible to model
stealing attacks, where an adversary attempts to steal the model within a restricted access …
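
A minimal sketch of the hard-label restriction this paper works under, again with a toy local victim (all names and architectures are illustrative): the API returns only the top-1 class, so the clone can be trained with plain cross-entropy at best. The paper's actual method synthesizes queries with a learned generator rather than the random noise used here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

victim = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5)).eval()
clone = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))

def hard_label(x):
    with torch.no_grad():
        return victim(x).argmax(dim=1)       # top-1 label, no confidence scores

opt = torch.optim.Adam(clone.parameters(), lr=1e-3)
for step in range(500):
    x = torch.randn(64, 20)                  # data-free: synthetic queries
    loss = F.cross_entropy(clone(x), hard_label(x))  # all a hard label permits
    opt.zero_grad(); loss.backward(); opt.step()
```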

Towards efficient data free black-box adversarial attack

J Zhang, B Li, J Xu, S Wu, S Ding… - Proceedings of the …, 2022 - openaccess.thecvf.com
Classic black-box adversarial attacks can take advantage of transferable adversarial
examples generated by a similar substitute model to successfully fool the target model …
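
A sketch of the classic transfer attack this snippet describes, assuming PyTorch and using FGSM on a white-box substitute (the paper's contribution, doing this efficiently without training data, goes beyond this toy): adversarial examples are crafted locally and then replayed against the black-box target.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

substitute = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
black_box = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5)).eval()

def fgsm(x, y, eps=0.1):
    """Craft an adversarial example on the white-box substitute."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(substitute(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

x, y = torch.randn(8, 20), torch.randint(0, 5, (8,))
x_adv = fgsm(x, y)                           # crafted locally ...
fooled = (black_box(x_adv).argmax(1) != y)   # ... transferred to the target
print(f"transfer fooling rate: {fooled.float().mean():.2f}")
```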

Lion: Adversarial distillation of proprietary large language models

Y Jiang, C Chan, M Chen, W Wang - arXiv preprint arXiv:2305.12870, 2023 - arxiv.org
The practice of transferring knowledge from a sophisticated, proprietary large language
model (LLM) to a compact, open-source LLM has garnered considerable attention. Previous …
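
A skeleton of the imitation-style distillation the snippet describes, assuming the `transformers` library with GPT-2 as a stand-in student; `teacher_generate` is a hypothetical placeholder for the proprietary vendor API, and Lion's distinguishing adversarial loop (mining hard instructions from the student's failures) is omitted.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def teacher_generate(prompt: str) -> str:
    """Placeholder for the proprietary teacher's text-completion API."""
    return "..."  # in practice, a (paid) remote call

tok = AutoTokenizer.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(student.parameters(), lr=5e-5)

prompts = ["Explain knowledge distillation in one sentence."]
for p in prompts:
    text = p + " " + teacher_generate(p)          # (instruction, response) pair
    batch = tok(text, return_tensors="pt")
    loss = student(**batch, labels=batch["input_ids"]).loss
    opt.zero_grad(); loss.backward(); opt.step()
```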

Explainable artificial intelligence for cybersecurity: a literature survey

F Charmet, HC Tanuwidjaja, S Ayoubi… - Annals of …, 2022 - Springer
With the extensive application of deep learning (DL) algorithms in recent years, e.g., for
detecting Android malware or vulnerable source code, artificial intelligence (AI) and …

Learning to retain while acquiring: Combating distribution-shift in adversarial data-free knowledge distillation

G Patel, KR Mopuri, Q Qiu - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Data-free Knowledge Distillation (DFKD) has gained popularity recently, with the
fundamental idea of carrying out knowledge transfer from a Teacher neural network to a …
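
A minimal sketch of the generic adversarial DFKD loop this paper builds on, with toy PyTorch networks (all architectures and hyperparameters are illustrative): a generator is trained to produce pseudo-samples where student and teacher disagree, and the student is trained to match the teacher on them. The paper's actual contribution, combating the distribution shift this adversarial game induces, is not shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5)).eval()
student = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 5))
generator = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 20))
for p in teacher.parameters():
    p.requires_grad_(False)                       # frozen, but differentiable

def kl(s_logits, t_logits):
    return F.kl_div(F.log_softmax(s_logits, dim=1),
                    F.softmax(t_logits, dim=1), reduction="batchmean")

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(300):
    # Generator seeks pseudo-samples on which student and teacher disagree.
    x = generator(torch.randn(64, 10))
    loss_g = -kl(student(x), teacher(x))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    # Student matches the teacher on freshly generated pseudo-samples.
    x = generator(torch.randn(64, 10)).detach()
    loss_s = kl(student(x), teacher(x))
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```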