Knowledge distillation of transformer-based language models revisited

C Lu, J Zhang, Y Chu, Z Chen, J Zhou, F Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
In the past few years, transformer-based pre-trained language models have achieved
astounding success in both industry and academia. However, the large model size and high …