Boost vision transformer with GPU-friendly sparsity and quantization

Z Yang, A Zeng, C Yuan, Y Li - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an
image. This task is challenging due to multi-scale body parts, fine-grained localization for …

Cited by 86 · Related articles · All 5 versions

A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking

L Papa, P Russo, I Amerini… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Vision Transformer (ViT) architectures are becoming increasingly popular and widely
employed to tackle computer vision applications. Their main feature is the capacity to extract …

Cited by 12 · Related articles · All 7 versions

Model quantization and hardware acceleration for vision transformers: A comprehensive survey

D Du, G Gong, X Chu - arXiv preprint arXiv:2405.00314, 2024 - arxiv.org

Vision Transformers (ViTs) have recently garnered considerable attention, emerging as a
promising alternative to convolutional neural networks (CNNs) in several vision-related …

Cited by 3 · Related articles · All 2 versions

Sparsity in transformers: A systematic literature review

M Farina, U Ahmad, A Taha, H Younes, Y Mesbah… - Neurocomputing, 2024 - Elsevier

Transformers have become the state-of-the-art architectures for various tasks in Natural
Language Processing (NLP) and Computer Vision (CV); however, their space and …

Cited by 10 · Related articles

Efficient multimodal large language models: A survey

Y Jin, J Li, Y Liu, T Gu, K Wu, Z Jiang, M He… - arXiv preprint arXiv …, 2024 - arxiv.org

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated
remarkable performance in tasks such as visual question answering, visual understanding …

Cited by 12 · Related articles · All 2 versions

Effective Interplay between Sparsity and Quantization: From Theory to Practice

SB Harma, A Chakraborty, E Kostenok… - arXiv preprint arXiv …, 2024 - arxiv.org

The increasing size of deep neural networks necessitates effective model compression to
improve computational efficiency and reduce their memory footprint. Sparsity and …

Cited by 2 · Related articles

RCV2023 Challenges: Benchmarking Model Training and Inference for Resource-Constrained Deep Learning

R Tiwari, A Chavan, D Gupta, G Mago… - Proceedings of the …, 2023 - openaccess.thecvf.com

This paper delves into the results of two resource-constrained deep learning challenges,
part of the workshop on Resource-Efficient Deep Learning for Computer Vision (RCV) at …

A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging

M Cao, L Wang, H Wang, X Yuan - arXiv preprint arXiv:2407.21517, 2024 - arxiv.org

Video Snapshot Compressive Imaging (SCI) aims to use a low-speed 2D camera to capture
high-speed scene as snapshot compressed measurements, followed by a reconstruction …

Cited by 1 · Related articles · All 3 versions

From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks

X Geng, Z Wang, C Chen, Q Xu, K Xu… - … on Neural Networks …, 2024 - ieeexplore.ieee.org

Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI)
tasks. However, deploying them brings significant challenges due to the huge cost of …

Advances in Pruning and Quantization for Natural Language Processing

U Bibi, M Mazhar, D Sabir, MFU Butt, A Hassan… - IEEE …, 2024 - ieeexplore.ieee.org

With ongoing advancements in natural language processing (NLP) and deep learning
methods, the demand for computational and memory resources has considerably increased …
