PET: Parameter-efficient knowledge distillation on transformer

H Jeon, S Park, JG Kim, U Kang - PLOS ONE, 2023 - journals.plos.org
Given a large Transformer model, how can we obtain a small and computationally efficient
model that maintains the performance of the original model? Transformer has shown …

Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers

J Xin, R Tang, Z Jiang, Y Yu, J Lin - Findings of the Association for …, 2023 - aclanthology.org
There exists a wide variety of efficiency methods for natural language processing (NLP)
tasks, such as pruning, distillation, dynamic inference, quantization, etc. From a different …

Building an efficiency pipeline: Commutativity and cumulativeness of efficiency operators for transformers

J Xin, R Tang, Z Jiang, Y Yu, J Lin - arXiv preprint arXiv:2208.00483, 2022 - arxiv.org
There exists a wide variety of efficiency methods for natural language processing (NLP)
tasks, such as pruning, distillation, dynamic inference, quantization, etc. We can consider an …

Knowledge distillation based contextual relevance matching for e-commerce product search

Z Liu, C Wang, H Feng, L Wu… - Proceedings of the 2022 …, 2022 - aclanthology.org
Online relevance matching is an essential task of e-commerce product search to boost the
utility of search engines and ensure a smooth user experience. Previous work adopts either …

Efficient Inference of Transformers in Natural Language Processing: Early Exiting and Beyond

J Xin - 2023 - uwspace.uwaterloo.ca
Large-scale pre-trained transformer models such as BERT have become ubiquitous in
Natural Language Processing (NLP) research and applications. They bring significant …