Similarity-preserving knowledge distillation

F Tung, G Mori - Proceedings of the IEEE/CVF international …, 2019 - openaccess.thecvf.com
… to compress large networks into more resource-efficient ones with minimal accuracy loss. In
… We presented in this paper a novel distillation loss for capturing and transferring knowledge
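
For context, a minimal PyTorch sketch of a batch-similarity-matching distillation loss of the kind this paper proposes, assuming (B, C, H, W) feature maps from one paired teacher/student layer; which layers are paired and how the term is weighted follow the paper, not this sketch.

import torch
import torch.nn.functional as F

def sp_loss(f_s, f_t):
    """Batch-similarity matching loss (sketch).

    f_s, f_t: student / teacher feature maps of shape (B, C, H, W).
    The student is pushed to reproduce which pairs of inputs the teacher
    activates similarly for, rather than the activations themselves.
    """
    b = f_s.size(0)
    g_s = torch.mm(f_s.reshape(b, -1), f_s.reshape(b, -1).t())  # (B, B) similarities
    g_t = torch.mm(f_t.reshape(b, -1), f_t.reshape(b, -1).t())
    g_s = F.normalize(g_s, p=2, dim=1)  # row-wise L2 normalization
    g_t = F.normalize(g_t, p=2, dim=1)
    return (g_s - g_t).pow(2).sum() / (b * b)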

Knowledge distillation from a stronger teacher

T Huang, S You, F Wang, C Qian… - Advances in Neural …, 2022 - proceedings.neurips.cc
… misalignment between KD loss and classification loss would be more severe, thus disturbing the
student’s training. As a result, the exact match (i.e., the loss reaches its minimum if and only if …
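One way to relax the exact match the snippet alludes to is to match relations between predictions rather than the predictions themselves; the sketch below uses Pearson correlation across classes and across the batch, with placeholder temperature and equal weighting, and is not necessarily the authors' exact loss.

import torch

def pearson_loss(p_s, p_t, eps=1e-8):
    """1 - Pearson correlation between matching rows of two matrices."""
    p_s = p_s - p_s.mean(dim=-1, keepdim=True)
    p_t = p_t - p_t.mean(dim=-1, keepdim=True)
    corr = (p_s * p_t).sum(-1) / (p_s.norm(dim=-1) * p_t.norm(dim=-1) + eps)
    return (1.0 - corr).mean()

def relaxed_kd_loss(logits_s, logits_t, tau=4.0):
    """Correlation-based relaxation of the exact softmax match (sketch)."""
    p_s = (logits_s / tau).softmax(dim=1)
    p_t = (logits_t / tau).softmax(dim=1)
    inter = pearson_loss(p_s, p_t)          # per sample, across classes
    intra = pearson_loss(p_s.t(), p_t.t())  # per class, across the batch
    return inter + intra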

Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation

T Zhang, M Xue, J Zhang, H Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
… generalizable solution by weighted averaging the local minima located on the border of
areas with lower loss. Considering the relationship between the loss landscape’s geometry and …
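The weighted-averaging step mentioned in the snippet can be illustrated in isolation; a minimal sketch assuming identically structured peer models, which ignores the rest of the online-distillation procedure.

import copy
import torch

def hybridize(models, weights):
    """Return a model whose parameters are the weighted average of the peers.

    models: nn.Modules with identical architectures; weights: floats summing
    to 1. Buffers (e.g. BatchNorm statistics) are left untouched for brevity.
    Averaging solutions from nearby low-loss regions is one way to land in a
    flatter part of the loss landscape.
    """
    hybrid = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, p in hybrid.named_parameters():
            p.copy_(sum(w * dict(m.named_parameters())[name].data
                        for w, m in zip(weights, models)))
    return hybrid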

Knowledge distillation: A good teacher is patient and consistent

L Beyer, X Zhai, A Royer, L Markeeva… - Proceedings of the …, 2022 - openaccess.thecvf.com
knowledge distillation approach which does not suffer from these drawbacks. The idea behind
knowledge distillation is to “distill” … We closely follow the original distillation setup from [12] …
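The original setup from [12] is the softened-softmax distillation loss; a minimal sketch follows, with tau and alpha as placeholder hyperparameters, and assuming teacher and student see the same augmented crop so the two views stay consistent.

import torch.nn.functional as F

def kd_loss(logits_s, logits_t, targets, tau=2.0, alpha=0.5):
    """Softened-softmax distillation loss in the style of [12] (sketch)."""
    soft = F.kl_div(F.log_softmax(logits_s / tau, dim=1),
                    F.softmax(logits_t / tau, dim=1),
                    reduction="batchmean") * tau * tau   # rescale soft-target gradients
    hard = F.cross_entropy(logits_s, targets)
    return alpha * soft + (1.0 - alpha) * hard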

Knowledge distillation via route constrained optimization

X Jin, B Peng, Y Wu, Y Liu, J Liu… - Proceedings of the …, 2019 - openaccess.thecvf.com
… Besides the visualization of optimization trajectory, we also observe that the new local
minimum has better generalization capacity and is more robust to random noise in input space. …
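A hedged sketch of the stage-wise idea of anchoring the student to intermediate teacher checkpoints along the optimization trajectory; the function and argument names here are placeholders, not the authors' API.

import torch

def route_constrained_distill(student, teacher, checkpoint_paths, loader,
                              epochs_per_stage, kd_criterion, optimizer):
    """Distill stage by stage against intermediate teacher checkpoints (sketch).

    checkpoint_paths: teacher checkpoints ordered from early in training to
    fully converged; each stage anchors the student to an easier target
    before moving to the next one along the teacher's optimization route.
    """
    for path in checkpoint_paths:
        teacher.load_state_dict(torch.load(path, map_location="cpu"))
        teacher.eval()
        for _ in range(epochs_per_stage):
            for x, y in loader:
                with torch.no_grad():
                    t_logits = teacher(x)
                loss = kd_criterion(student(x), t_logits, y)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
    return student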

Parameter-efficient and student-friendly knowledge distillation

J Rao, X Meng, L Ding, S Qi, X Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
… adapter module [19] for knowledge distillation, updating the parameters of this adapter …
knowledge distillation methods for comparison, including Vanilla [1], knowledge distillation
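A generic bottleneck adapter of the kind cited as [19], with the backbone frozen so only the adapter parameters are updated; where the adapters are inserted and how the distillation loss is attached is the paper's design, not this sketch.

import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck adapter added to a frozen pretrained model."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, h):
        return h + self.up(self.act(self.down(h)))  # residual bottleneck

def freeze_all_but_adapters(model, adapters):
    """Freeze the backbone so only the adapters' parameters stay trainable."""
    for p in model.parameters():
        p.requires_grad = False
    for a in adapters:
        for p in a.parameters():
            p.requires_grad = True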

Online knowledge distillation via collaborative learning

Q Guo, X Wang, Y Wu, Z Yu, D Liang… - Proceedings of the …, 2020 - openaccess.thecvf.com
Loss function. To improve the generalization performance, we distill the knowledge of soft …
Then a neat way to generate the teacher logits is to select the minimum element of each row of …
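One reading of the "minimum element of each row" construction, assuming the rows stack target-shifted logits from the collaborating students; this is a sketch, not necessarily the paper's exact variant.

import torch

def min_logit_teacher(all_logits, targets):
    """Build ensemble teacher logits from peer students (sketch).

    all_logits: (M, B, C) logits from M collaborating students;
    targets: (B,) ground-truth class indices. Each student's logits are
    shifted so the target class sits at zero, then the element-wise
    minimum over students (the minimum of each row) is taken.
    """
    m, b, _ = all_logits.shape
    target_logit = all_logits.gather(2, targets.view(1, b, 1).expand(m, b, 1))
    shifted = all_logits - target_logit       # zero at the ground-truth class
    return shifted.min(dim=0).values          # (B, C) teacher logits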

Teach less, learn more: On the undistillable classes in knowledge distillation

Y Zhu, N Liu, Z Xu, X Liu, W Meng… - Advances in …, 2022 - proceedings.neurips.cc
… in knowledge distillation and a surge of related literature on designing better distillation …
this problem, but minimal effort has been put into understanding this phenomenon. …

Distilling spikes: Knowledge distillation in spiking neural networks

RK Kushawaha, S Kumar, B Banerjee… - 2020 25th …, 2021 - ieeexplore.ieee.org
… In this work, we propose a knowledge distillation approach that allows transferring the knowledge of
a large SNN to a smaller one in a disciplined fashion with minimal loss in performance. We …

… for reducing arrhythmia classification from 12-lead ECG signals to single-lead ECG with minimal loss of accuracy through teacher-student knowledge distillation

M Sepahvand, F Abdali-Mohammadi - Information Sciences, 2022 - Elsevier
Knowledge distillation was utilized in this paper to propose a method for bridging the gap …
Despite its simplicity, the student model receives the dark knowledge of multi-lead ECG signals …