Knowledge diffusion for distillation

T Huang, Y Zhang, M Zheng, S You… - Advances in …, 2024 - proceedings.neurips.cc
The representation gap between teacher and student is an emerging topic in knowledge
distillation (KD). To reduce this gap and improve performance, current methods often …
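For context only (not this paper's diffusion-based method), a minimal sketch of the classic logit-distillation loss that gap-reduction approaches build on; tensor names and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor, temperature: float = 4.0) -> torch.Tensor:
    """Classic logit distillation: KL divergence between temperature-softened distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```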

FreeKD: Knowledge Distillation via Semantic Frequency Prompt

Y Zhang, T Huang, J Liu, T Jiang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Knowledge distillation (KD) has been applied to various tasks successfully, and
mainstream methods typically boost the student model via spatial imitation losses. However …
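For reference, a hedged sketch of the kind of spatial feature-imitation loss the abstract contrasts against; the 1x1 channel-alignment projection and interpolation step are assumed conventions, not FreeKD's frequency-prompt design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialImitationLoss(nn.Module):
    """Pixel-wise MSE between student and teacher feature maps (illustrative)."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # Project student features to the teacher's channel width (assumed setup).
        self.align = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
        f_student = self.align(f_student)
        # Match spatial size if the two backbones downsample differently.
        if f_student.shape[-2:] != f_teacher.shape[-2:]:
            f_student = F.interpolate(
                f_student, size=f_teacher.shape[-2:], mode="bilinear", align_corners=False
            )
        return F.mse_loss(f_student, f_teacher.detach())
```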

Cloud-Device Collaborative Learning for Multimodal Large Language Models

G Wang, J Liu, C Li, Y Zhang, J Ma… - Proceedings of the …, 2024 - openaccess.thecvf.com
The burgeoning field of Multimodal Large Language Models (MLLMs) has exhibited
remarkable performance in diverse tasks such as captioning, commonsense reasoning, and …

Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation

Z Zheng, T Huang, G Li, Z Wang - arXiv preprint arXiv:2404.16386, 2024 - arxiv.org
Recently, the performance of monocular depth estimation (MDE) has been significantly
boosted with the integration of transformer models. However, the transformer models are …

Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection

J Yi, J Mao, T Liu, M Li, H Gu, H Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge distillation (KD) is a widely adopted and effective method for compressing
models in object detection tasks. In particular, feature-based distillation methods have shown …

Razor SNN: Efficient Spiking Neural Network with Temporal Embeddings

Y Zhang, J Cao, J Chen, W Sun, Y Wang - International Conference on …, 2023 - Springer
The event streams generated by dynamic vision sensors (DVS) are sparse and non-uniform
in the spatial domain, while still dense and redundant in the temporal domain. Although …
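As background for the sparse-spatial / dense-temporal observation, a hedged sketch of how DVS event streams are commonly accumulated into temporal frame bins before being fed to a spiking network; the structured-array event layout (x, y, t, p) and the bin count are assumptions, not Razor SNN's pipeline.

```python
import numpy as np

def events_to_frames(events: np.ndarray, height: int, width: int, num_bins: int = 10) -> np.ndarray:
    """Accumulate (x, y, t, p) events into num_bins two-channel spatial frames.

    Assumes `events` is a structured array with integer fields 'x', 'y', 'p'
    and a timestamp field 't'. Each frame is sparse in space (few active
    pixels) while the temporal axis stays densely populated, which is the
    redundancy the abstract points out.
    """
    frames = np.zeros((num_bins, 2, height, width), dtype=np.float32)
    t_min, t_max = events["t"].min(), events["t"].max()
    bin_idx = np.clip(
        ((events["t"] - t_min) / max(t_max - t_min, 1) * num_bins).astype(int),
        0, num_bins - 1,
    )
    pol = (events["p"] > 0).astype(int)  # channel 0 = OFF events, channel 1 = ON events
    np.add.at(frames, (bin_idx, pol, events["y"], events["x"]), 1.0)
    return frames
```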