One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

Z Hao, J Guo, K Han, Y Tang, H Hu, Y Wang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Knowledge distillation (KD) has proven to be a highly effective approach for
enhancing model performance through a teacher-student training scheme. However, most existing …

Other indexed versions of the same paper (identical abstract text): Proceedings of the 37th …, 2023 - dl.acm.org; arXiv preprint arXiv …, 2023 - arxiv.org; arXiv e-prints, 2023 - ui.adsabs.harvard.edu; Thirty-seventh Conference … - openreview.net
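
The abstract snippet mentions KD's teacher-student training scheme only in passing. As background, here is a minimal sketch of the generic Hinton-style logit-distillation objective in PyTorch; the function name kd_loss and the hyperparameter values T and alpha are illustrative assumptions, and this is not the OFA-KD method the paper itself proposes.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Illustrative generic KD objective, not the paper's method.
    # Soften both output distributions with temperature T.
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    # KL term, scaled by T^2 so gradients keep a comparable magnitude.
    distill = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * T * T
    # Hard-label supervised term on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # alpha balances distillation against ground-truth supervision.
    return alpha * distill + (1.0 - alpha) * ce

In this scheme the student is trained by backpropagating the combined loss while the teacher's parameters stay frozen.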