An Overview of Trustworthy AI: Advances in IP Protection, Privacy-preserving Federated Learning, Security Verification, and GAI Safety Alignment

Y Zheng, CH Chang, SH Huang… - IEEE Journal on …, 2024 - ieeexplore.ieee.org
AI has undergone a remarkable evolutionary journey marked by groundbreaking milestones.
Like any powerful tool, it can be turned into a weapon for devastation in the wrong hands …

Revisiting Black-box Ownership Verification for Graph Neural Networks

R Zhou, K Yang, X Wang, WH Wang… - 2024 IEEE Symposium on …, 2024 - computer.org
Graph Neural Networks (GNNs) have emerged as powerful tools for processing
graph-structured data, enabling applications in various domains. Yet, GNNs are vulnerable …

HoneypotNet: Backdoor Attacks Against Model Extraction

Y Wang, T Gu, Y Teng, Y Wang, X Ma - arXiv preprint arXiv:2501.01090, 2025 - arxiv.org
Model extraction attacks are a type of inference-time attack that approximates the
functionality and performance of a black-box victim model by launching a certain number of …

Intellectual Property Protection for Deep Learning Model and Dataset Intelligence

Y Jiang, Y Gao, C Zhou, H Hu, A Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
With the growing applications of Deep Learning (DL), especially the recent spectacular
achievements of Large Language Models (LLMs) such as ChatGPT and LLaMA, the …

Protecting object detection models from model extraction attack via feature space coverage

Z Li, Y Pu, X Zhang, Y Li, J Li, S Ji - Proceedings of the 33rd International …, 2024 - ijcai.org
Model extraction attacks aim to steal the functionality or private information of well-trained
machine learning models. With the gradual popularization of AI …

AI Risk Management Should Incorporate Both Safety and Security

X Qi, Y Huang, Y Zeng, E Debenedetti… - arXiv preprint arXiv …, 2024 - arxiv.org
The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility
to adversarial attacks, has shed light on the intricate interplay between AI safety and AI …

VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces

SS Kumar, Y Govindarajulu, P Kulkarni… - arXiv preprint arXiv …, 2024 - arxiv.org
In the domain of black-box model extraction, conventional methods reliant on soft labels or
surrogate datasets struggle with scaling to high-dimensional input spaces and managing the …

DualCOS: Query-Efficient Data-Free Model Stealing with Dual Clone Networks and Optimal Samples

Y Yang, X Chen, Y Xuan, Z Zhao - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Although data-free model stealing attacks are free from reliance on real data, they suffer
from limitations, including low accuracy and high query budgets, which restrict their practical …

IPES: Improved Pre-trained Encoder Stealing Attack in Contrastive Learning

C Zhang, Z Li, H Liang, J Liang, X Liu… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Recent studies have shed light on security vulnerabilities in Encoder-as-a-Service (EaaS)
systems that enable the theft of valuable encoder attributes such as functionality. However …

PtbStolen: Pre-trained Encoder Stealing Through Perturbed Samples

C Zhang, H Liang, Z Li, T Wu, L Wang, L Zhu - International Symposium on …, 2023 - Springer