Knowledge distillation from a stronger teacher

T Huang, S You, F Wang, C Qian… - Advances in Neural …, 2022 - proceedings.neurips.cc
Unlike existing knowledge distillation methods, which focus on baseline settings where the
teacher models and training strategies are not that strong and competitive with the state-of-the-art …

[PDF] Knowledge Distillation from A Stronger Teacher

T Huang, S You, F Wang, C Qian, C Xu - taohuang.info
In our paper (DIST): we unify the teacher with larger capacity and the teacher with a stronger training
strategy into one topic, the stronger teacher, as they both change the output distribution of …
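
The snippet above points at DIST's core observation: both larger-capacity teachers and stronger training recipes shift the teacher's output distribution, so an exact pointwise match (e.g. plain KL divergence) between teacher and student predictions becomes harder. Below is a minimal, hypothetical PyTorch sketch of a correlation-based relaxed matching loss in that spirit, assuming the inter-class/intra-class Pearson-correlation formulation described in the paper; the function names, temperature tau, and weights beta/gamma are illustrative choices, not the authors' reference implementation.

    import torch
    import torch.nn.functional as F

    def pearson_corr(a, b, eps=1e-8):
        # Pearson correlation along the last dimension: center, normalize, dot product.
        a = a - a.mean(dim=-1, keepdim=True)
        b = b - b.mean(dim=-1, keepdim=True)
        a = a / (a.norm(dim=-1, keepdim=True) + eps)
        b = b / (b.norm(dim=-1, keepdim=True) + eps)
        return (a * b).sum(dim=-1)

    def dist_style_loss(student_logits, teacher_logits, tau=4.0, beta=1.0, gamma=1.0):
        # Softened predictions; tau/beta/gamma are assumed hyper-parameters for this sketch.
        p_s = F.softmax(student_logits / tau, dim=-1)
        p_t = F.softmax(teacher_logits / tau, dim=-1)
        # Inter-class relation: correlation across classes, one value per sample.
        inter = (1.0 - pearson_corr(p_s, p_t)).mean()
        # Intra-class relation: correlation across the batch, one value per class.
        intra = (1.0 - pearson_corr(p_s.t(), p_t.t())).mean()
        return beta * inter + gamma * intra

    # Usage sketch: batch of 8 samples, 100 classes.
    s = torch.randn(8, 100)  # student logits
    t = torch.randn(8, 100)  # teacher logits
    loss = dist_style_loss(s, t)

Because the loss only asks the student's predictions to be linearly correlated with the teacher's, rather than identical, it tolerates the sharper or shifted distributions that stronger teachers tend to produce.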
