Towards Efficient Pre-Trained Language Model via Feature Correlation Distillation

K. Huang, X. Guo, M. Wang - Advances in Neural Information Processing Systems (NeurIPS), 2023 - proceedings.neurips.cc

Knowledge Distillation (KD) has emerged as a promising approach for compressing large Pre-trained Language Models (PLMs). The performance of KD relies on how to effectively …
