SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning

Z Kong, P Dong, X Ma, X Meng, W Niu, M Sun… - European conference on …, 2022 - Springer
Abstract: Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …

Quantized transformer language model implementations on edge devices

MWU Rahman, MM Abrar, HG Copening… - 2023 International …, 2023 - ieeexplore.ieee.org
Large-scale transformer-based models like the Bidirectional Encoder Representations from
Transformers (BERT) are widely used for Natural Language Processing (NLP) applications …

Optimizing Large Language Models for Edge Devices: A Comparative Study on Reputation Analysis

MWU Rahman - 2023 - search.proquest.com
The widespread adoption of social media platforms has led to an exponential surge in user-
generated data, shaping the reputations of companies and public figures on a global scale …