J Zhang, Y Gao, M Zhou, R Liu, X Cheng… - Available at SSRN … - papers.ssrn.com
A recent advance in knowledge distillation (KD) is to extract and transfer the middle-layer knowledge of teacher models to student models, which outperforms the original KD, which …
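The contrast in the snippet, original (logit-based) KD versus middle-layer feature transfer, can be sketched as two loss terms. This is a minimal NumPy illustration, not the paper's method: the temperature `T`, weight `alpha`, and the plain MSE "hint" loss (FitNets-style) are assumptions for the sketch.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Original KD: KL divergence between temperature-softened output
    distributions, scaled by T^2 as in Hinton-style distillation."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return (T * T) * np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean()

def hint_loss(student_feat, teacher_feat):
    """Middle-layer KD: match intermediate features of teacher and student
    (here simply by mean squared error; shapes assumed already aligned)."""
    return np.mean((student_feat - teacher_feat) ** 2)

def total_loss(s_logits, t_logits, s_feat, t_feat, alpha=0.5):
    """Hypothetical combined objective mixing both knowledge sources."""
    return alpha * kd_loss(s_logits, t_logits) + (1 - alpha) * hint_loss(s_feat, t_feat)
```

In practice the student's feature map usually needs a learned projection to match the teacher's dimensionality before the hint loss is applied; that step is omitted here.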