HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers

C Liang, H Jiang, Z Li, X Tang, B Yin, T Zhao - arXiv preprint arXiv …, 2023 - arxiv.org
Knowledge distillation has been shown to be a powerful model compression approach to
facilitate the deployment of pre-trained language models in practice. This paper focuses on …

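The snippet above frames knowledge distillation as a model compression approach for deploying pre-trained language models. As context only, the sketch below shows a generic soft-label distillation loss in plain NumPy; it is not the HomoDistil procedure (which the excerpt does not detail), and the names `temperature` and `alpha` are assumed hyperparameters introduced here for illustration.

```python
# Minimal sketch of a standard soft-label knowledge distillation loss,
# assuming teacher/student logits are available as NumPy arrays.
# Generic illustration only; not the HomoDistil method described in the paper.
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Mix of a soft-label KL term and a hard-label cross-entropy term.

    `temperature` and `alpha` are assumed hyperparameters, not values
    taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on softened distributions, scaled by T^2.
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12)
                             - np.log(p_student + 1e-12)), axis=-1)
    soft_loss = (temperature ** 2) * kl.mean()
    # Hard-label cross-entropy at temperature 1.
    p_hard = softmax(student_logits, 1.0)
    ce = -np.log(p_hard[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * soft_loss + (1.0 - alpha) * ce

# Toy usage: batch of 4 examples, 10 classes.
rng = np.random.default_rng(0)
student = rng.normal(size=(4, 10))
teacher = rng.normal(size=(4, 10))
labels = rng.integers(0, 10, size=4)
print(distillation_loss(student, teacher, labels))
```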