PET: Parameter-efficient knowledge distillation on transformer

H Jeon, S Park, JG Kim, U Kang - PLOS ONE, 2023 - journals.plos.org
Given a large Transformer model, how can we obtain a small and computationally efficient
model that maintains the performance of the original model? Transformer has shown …

Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers

J Xin, R Tang, Z Jiang, Y Yu, J Lin - Findings of the Association for …, 2023 - aclanthology.org
There exists a wide variety of efficiency methods for natural language processing (NLP)
tasks, such as pruning, distillation, dynamic inference, quantization, etc. From a different …

Building an efficiency pipeline: Commutativity and cumulativeness of efficiency operators for transformers

J Xin, R Tang, Z Jiang, Y Yu, J Lin - arXiv preprint arXiv:2208.00483, 2022 - arxiv.org
There exists a wide variety of efficiency methods for natural language processing (NLP)
tasks, such as pruning, distillation, dynamic inference, quantization, etc. We can consider an …

Knowledge distillation based contextual relevance matching for e-commerce product search

Z Liu, C Wang, H Feng, L Wu… - Proceedings of the 2022 …, 2022 - aclanthology.org
Online relevance matching is an essential task of e-commerce product search to boost the
utility of search engines and ensure a smooth user experience. Previous work adopts either …

Efficient Inference of Transformers in Natural Language Processing: Early Exiting and Beyond

J Xin - 2023 - uwspace.uwaterloo.ca
Large-scale pre-trained transformer models such as BERT have become ubiquitous in
Natural Language Processing (NLP) research and applications. They bring significant …