Knowledge distillation and student-teacher learning for visual intelligence: A review and new outlooks

L Wang, KJ Yoon - IEEE Transactions on Pattern Analysis and …, 2021 - ieeexplore.ieee.org
Deep neural models have, in recent years, been successful in almost every field, even
solving the most complex problems. However, these models are huge in size, with …

Ensemble deep learning in bioinformatics

Y Cao, TA Geddes, JYH Yang, P Yang - Nature Machine Intelligence, 2020 - nature.com
The remarkable flexibility and adaptability of ensemble methods and deep learning models
have led to the proliferation of their application in bioinformatics research. Traditionally …

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

Comparison of CNN-based deep learning architectures for rice diseases classification

MT Ahad, Y Li, B Song, T Bhuiyan - Artificial Intelligence in Agriculture, 2023 - Elsevier
Although convolutional neural network (CNN) paradigms have expanded to transfer
learning and ensemble models from original individual CNN architectures, few studies have …
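
As a point of reference for the transfer-learning approach the snippet mentions, below is a minimal sketch (assuming PyTorch/torchvision and a hypothetical number of rice-disease classes; this is not the authors' exact pipeline) of reusing an ImageNet-pretrained CNN backbone and retraining only the classification head:

    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 9  # hypothetical number of rice-disease categories

    # Load an ImageNet-pretrained backbone and freeze its weights.
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    for p in model.parameters():
        p.requires_grad = False
    # Replace the classifier head with a new, trainable layer for the target classes.
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    def train_step(images, labels):
        # One optimization step on a batch from a (hypothetical) leaf-image loader.
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        return loss.item()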

Towards understanding ensemble, knowledge distillation and self-distillation in deep learning

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how ensembles of deep learning models can improve test accuracy, and
how the superior performance of an ensemble can be distilled into a single model using …
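
A minimal sketch of the ensemble-distillation recipe the abstract refers to, in PyTorch: the student matches the averaged, temperature-softened predictions of several teachers, plus ordinary cross-entropy on the labels. Function and parameter names are illustrative, not the paper's notation.

    import torch
    import torch.nn.functional as F

    def ensemble_distillation_loss(student_logits, teacher_logits_list, labels,
                                   T=4.0, alpha=0.9):
        # Average the temperature-softened predictions of all teachers in the ensemble.
        teacher_probs = torch.stack(
            [F.softmax(t / T, dim=1) for t in teacher_logits_list]).mean(dim=0)
        # KL divergence between the student's softened prediction and the ensemble mean.
        kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                      teacher_probs, reduction="batchmean") * T * T
        # Ordinary cross-entropy on the ground-truth labels keeps the student grounded.
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kd + (1 - alpha) * ce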

Group knowledge transfer: Federated learning of large cnns at the edge

C He, M Annavaram… - Advances in Neural …, 2020 - proceedings.neurips.cc
Scaling up the convolutional neural network (CNN) size (e.g., width, depth, etc.) is known to
effectively improve model accuracy. However, the large model size impedes training on …

Contrastive representation distillation

Y Tian, D Krishnan, P Isola - arXiv preprint arXiv:1910.10699, 2019 - arxiv.org
Often we wish to transfer representational knowledge from one neural network to another.
Examples include distilling a large network into a smaller one, transferring knowledge from …
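
A simplified sketch of the contrastive idea in PyTorch: each student embedding is pulled toward its own teacher embedding and pushed away from the other samples in the batch. The paper's actual objective uses a large memory bank of negatives and a learned projection; the in-batch version below is only illustrative.

    import torch
    import torch.nn.functional as F

    def contrastive_distillation_loss(student_feats, teacher_feats, temperature=0.1):
        # Assumes student and teacher features already share a dimension; in practice
        # a small learned projection maps the student into the teacher's space.
        s = F.normalize(student_feats, dim=1)   # (B, D) student embeddings
        t = F.normalize(teacher_feats, dim=1)   # (B, D) teacher embeddings
        logits = s @ t.t() / temperature        # (B, B) cosine-similarity matrix
        # Each row's positive is its own teacher embedding (the diagonal entry).
        targets = torch.arange(s.size(0), device=s.device)
        return F.cross_entropy(logits, targets)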

Dive into ambiguity: Latent distribution mining and pairwise uncertainty estimation for facial expression recognition

J She, Y Hu, H Shi, J Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Due to the subjective annotation and the inherent inter-class similarity of facial expressions,
one of the key challenges in Facial Expression Recognition (FER) is the annotation ambiguity …

Similarity-preserving knowledge distillation

F Tung, G Mori - Proceedings of the IEEE/CVF International …, 2019 - openaccess.thecvf.com
Knowledge distillation is a widely applicable technique for training a student neural
network under the guidance of a trained teacher network. For example, in neural network …
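
A minimal sketch of the similarity-preserving objective in PyTorch: batchwise pairwise-similarity matrices of student and teacher activations are matched at a given layer, so inputs that the teacher treats as similar stay similar for the student. Variable names and the single-layer setup are illustrative.

    import torch.nn.functional as F

    def similarity_preserving_loss(student_acts, teacher_acts):
        # Flatten per-sample activations and form batch-by-batch similarity matrices.
        b = student_acts.size(0)
        g_s = student_acts.view(b, -1) @ student_acts.view(b, -1).t()
        g_t = teacher_acts.view(b, -1) @ teacher_acts.view(b, -1).t()
        # Row-wise L2 normalization, then mean squared difference between the two.
        g_s = F.normalize(g_s, p=2, dim=1)
        g_t = F.normalize(g_t, p=2, dim=1)
        return ((g_s - g_t) ** 2).sum() / (b * b)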

On the efficacy of knowledge distillation

JH Cho, B Hariharan - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
In this paper, we present a thorough evaluation of the efficacy of knowledge distillation and
its dependence on student and teacher architectures. Starting with the observation that more …