Hand-object contact consistency reasoning for human grasps generation

H Jiang, S Liu, J Wang, X Wang - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
While predicting robot grasps with parallel-jaw grippers has been well studied and widely
applied in robot manipulation tasks, the study on natural human grasp generation with a …

Self-supervised policy adaptation during deployment

N Hansen, R Jangir, Y Sun, G Alenyà, P Abbeel… - arXiv preprint arXiv …, 2020 - arxiv.org
In most real-world scenarios, a policy trained by reinforcement learning in one environment
needs to be deployed in another, potentially quite different environment. However …

It's always personal: Using early exits for efficient on-device CNN personalisation

I Leontiadis, S Laskaridis, SI Venieris… - Proceedings of the 22nd …, 2021 - dl.acm.org
On-device machine learning is becoming a reality thanks to the availability of powerful
hardware and model compression techniques. Typically, these models are pretrained on …

Gradient knowledge distillation for pre-trained language models

L Wang, L Li, X Sun - arXiv preprint arXiv:2211.01071, 2022 - arxiv.org
Knowledge distillation (KD) is an effective framework to transfer knowledge from a large-
scale teacher to a compact yet well-performing student. Previous KD practices for pre …
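
For context on the distillation entries in this list, below is a minimal sketch of the classic logit-matching knowledge distillation objective (soft teacher targets blended with hard-label cross-entropy) that such work builds on. It is an illustration only, not the gradient-based objective proposed in this paper; the function name and the hyperparameters (temperature T, mixing weight alpha) are assumptions chosen for the example.

    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft-target term: KL divergence between the temperature-scaled
        # teacher and student distributions, scaled by T^2 as is conventional.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Hard-label term: ordinary cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        # alpha controls how strongly the student follows the teacher's soft targets.
        return alpha * soft + (1.0 - alpha) * hard
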

Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks

H Kim, DW Choi - Proceedings of the 2021 International Conference on …, 2021 - dl.acm.org
In spite of the great success of deep learning technologies, training and delivery of a
practically serviceable model is still a highly time-consuming process. Furthermore, a …
