Few-shot low-resource knowledge graph completion with multi-view task representation generation

S Pei, Z Kou, Q Zhang, X Zhang - Proceedings of the 29th ACM SIGKDD …, 2023 - dl.acm.org
Despite their capacity to convey knowledge, most existing knowledge graphs (KGs) are
created for specific domains using low-resource data sources, especially those in non …

Dynamic data-free knowledge distillation by easy-to-hard learning strategy

J Li, S Zhou, L Li, H Wang, J Bu, Z Yu - Information Sciences, 2023 - Elsevier
Data-free knowledge distillation (DFKD) is a widely used form of Knowledge Distillation (KD) in which the original training data is not available. It trains a lightweight student model with the aid of …
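
A minimal sketch of what a generic data-free KD training step can look like, assuming a simple generator that synthesizes inputs and a temperature-scaled KL objective against the teacher's logits; the function name, shapes, and hyperparameters below are illustrative and not taken from this paper.

import torch
import torch.nn.functional as F

def dfkd_student_step(generator, teacher, student, opt_s, batch_size=64, z_dim=100, T=4.0):
    # Synthesize a batch of inputs from latent noise; no real training data is used.
    z = torch.randn(batch_size, z_dim)
    x = generator(z)
    # Query the frozen teacher for soft targets on the synthetic inputs.
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    # Temperature-scaled KL divergence between student and teacher distributions.
    loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * T * T
    opt_s.zero_grad()
    loss.backward()
    opt_s.step()
    return loss.item()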

Towards the fundamental limits of knowledge transfer over finite domains

Q Zhao, B Zhu - arXiv preprint arXiv:2310.07838, 2023 - arxiv.org
We characterize the statistical efficiency of knowledge transfer through $n$ samples from a teacher to a probabilistic student classifier with input space $\mathcal{S}$ over labels …
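
A hedged sketch of the generic setting such a snippet describes, written in assumed notation (only $n$ and $\mathcal{S}$ appear in the excerpt; $p_T$, $\hat{p}_n$, and the KL risk below are illustrative, not the paper's exact formulation): the student fits a conditional label distribution from $n$ teacher-labeled samples and is judged by its expected divergence from the teacher.

\[
  \{(s_i, y_i)\}_{i=1}^{n}, \quad y_i \sim p_T(\cdot \mid s_i), \qquad
  \mathrm{Risk}(\hat{p}_n) = \mathbb{E}_{s}\left[\mathrm{KL}\!\left(p_T(\cdot \mid s)\,\middle\|\,\hat{p}_n(\cdot \mid s)\right)\right].
\]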

HDKD: Hybrid data-efficient knowledge distillation network for medical image classification

OS EL-Assiouti, G Hamed, D Khattab… - Engineering Applications of …, 2024 - Elsevier
Vision Transformers (ViTs) have achieved significant advances in computer vision tasks due to their powerful modeling capacity. However, their performance notably …

Advancing Few-Shot Black-Box Attack With Alternating Training

L Meng, M Shao, F Wang, Y Qiao… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) are known to be vulnerable to adversarial examples
even in black-box scenarios, posing a significant threat to their reliability and security. Most …

Mentored Learning: Improving Generalization and Convergence of Student Learner

X Cao, Y Guo, HT Shen, IW Tsang, JT Kwok - Journal of Machine Learning …, 2024 - jmlr.org
Student learners typically engage in an iterative process of actively updating their hypotheses, as in active learning. While this behavior can be advantageous, there is an inherent risk of …

Why does Knowledge Distillation work? Rethink its attention and fidelity mechanism

C Guo, S Zhong, X Liu, Q Feng, Y Ma - Expert Systems with Applications, 2025 - Elsevier
Does Knowledge Distillation (KD) really work? Conventional wisdom viewed it as a knowledge transfer procedure in which perfect mimicry of the teacher by the student is …
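
For reference, the "mimicry" view of KD that this snippet questions usually refers to the standard temperature-scaled distillation objective; a minimal sketch, assuming logits from both models and ground-truth labels (the hyperparameters T and alpha are illustrative):

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft term: match the teacher's temperature-softened distribution.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    # Hard term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard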

Comparative Knowledge Distillation

A Wilf, AT Xu, PP Liang, A Obolenskiy, D Fried… - arXiv preprint arXiv …, 2023 - arxiv.org
In the era of large-scale pretrained models, Knowledge Distillation (KD) plays an important role in transferring the wisdom of computationally heavy teacher models to lightweight …

Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers

DN Grigore, MI Georgescu, JA Justo… - arXiv preprint arXiv …, 2024 - arxiv.org
Few-shot knowledge distillation recently emerged as a viable approach to harness the
knowledge of large-scale pre-trained models, using limited data and computational …
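
The title pairs weight copying with low-rank adaptation; as a point of reference, a minimal sketch of a standard LoRA-style linear layer (the class name, rank, and scaling below are illustrative, not this paper's implementation): the pretrained weights stay frozen and only a low-rank update is trained, which suits the few-shot regime.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen pretrained linear layer plus a trainable low-rank update
    # W x + (alpha / r) * B (A x), the usual LoRA parameterization.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # keep the copied weights frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale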

BSPA: Exploring Black-box Stealthy Prompt Attacks against Image Generators

Y Tian, X Yang, Y Dong, H Yang, H Su… - arXiv preprint arXiv …, 2024 - arxiv.org
Extremely large image generators offer significant transformative potential across diverse sectors. They allow users to design specific prompts to generate realistic images through some …