Logit standardization in knowledge distillation

S Sun, W Ren, J Li, R Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Abstract: Knowledge distillation involves transferring soft labels from a teacher to a student
using a shared temperature-based softmax function. However, the assumption of a shared …
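
For context, a minimal sketch of the conventional shared-temperature softmax distillation loss this snippet refers to (Hinton-style KD in PyTorch; the function name and the temperature value are illustrative choices, not taken from the paper):

```python
import torch
import torch.nn.functional as F

def shared_temperature_kd_loss(student_logits: torch.Tensor,
                               teacher_logits: torch.Tensor,
                               T: float = 4.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student outputs.

    The same temperature T is applied to both models' logits, which is the
    shared-temperature assumption the cited paper questions.
    """
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # T**2 rescales the gradient magnitude so it stays comparable across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```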

Reciprocal Teacher-Student Learning via Forward and Feedback Knowledge Distillation

J Gou, Y Chen, B Yu, J Liu, L Du… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Knowledge distillation (KD) is a prevalent model compression technique in deep learning,
aiming to leverage knowledge from a large teacher model to enhance the training of a …

Efficient crowd counting via dual knowledge distillation

R Wang, Y Hao, L Hu, X Li, M Chen… - … on Image Processing, 2023 - ieeexplore.ieee.org
Most researchers focus on designing accurate crowd counting models with heavy
parameters and computations but ignore the resource burden during the model deployment …

Instance Temperature Knowledge Distillation

Z Zhang, Y Zhou, J Gong, J Liu, Z Tu - arXiv preprint arXiv:2407.00115, 2024 - arxiv.org
Knowledge distillation (KD) enhances the performance of a student network by allowing it to
learn the knowledge transferred from a teacher network incrementally. Existing methods …

Good Teachers Explain: Explanation-Enhanced Knowledge Distillation

A Parchami-Araghi, M Böhle, S Rao… - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge Distillation (KD) has proven effective for compressing large teacher models into
smaller student models. While it is well known that student models can achieve similar …

Expanding and Refining Hybrid Compressors for Efficient Object Re-identification

Y Xie, H Wu, J Zhu, H Zeng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recent object re-identification (Re-ID) methods gain high efficiency via lightweight student
models trained by knowledge distillation (KD). However, the huge architectural difference …

Attention and feature transfer based knowledge distillation

G Yang, S Yu, Y Sheng, H Yang - Scientific Reports, 2023 - nature.com
Existing knowledge distillation (KD) methods are mainly based on features, logic, or
attention, where features and logic represent the results of reasoning at different stages of a …

Maximizing discrimination capability of knowledge distillation with energy function

S Kim, G Ham, S Lee, D Jang, D Kim - Knowledge-Based Systems, 2024 - Elsevier
To apply the latest computer vision techniques that require a large computational cost in real
industrial applications, knowledge distillation methods (KDs) are essential. Existing logit …

Post-Distillation via Neural Resuscitation

Z Bao, Z Chen, C Wang, WS Zheng… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Knowledge distillation, a widely adopted model compression technique, distils knowledge
from a large teacher model to a smaller student model, with the goal of reducing the …

Few-shot Classification Model Compression via School Learning

S Yang, F Liu, D Chen, H Huang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Few-shot classification (FSC) is a challenging task due to limited access to training
data. Recent methods often employ highly complex networks to obtain high-quality features …