VAQF: Fully automatic software-hardware co-design framework for low-bit vision transformer

S Jamil, M Jalil Piran, OJ Kwon - Drones, 2023 - mdpi.com

As a special type of transformer, vision transformers (ViTs) can be used for various computer
vision (CV) applications. Convolutional neural networks (CNNs) have several potential …

Cited by 34 · Related articles · All 8 versions

[PDF] neurips.cc

M³ViT: Mixture-of-experts vision transformer for efficient multi-task learning with model-accelerator co-design

Z Fan, R Sarkar, Z Jiang, T Chen… - Advances in …, 2022 - proceedings.neurips.cc

Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often
lets those tasks learn better jointly. Multi-tasking models have become successful and often …

Cited by 50 · Related articles · All 7 versions

[PDF] arxiv.org

PPT: Token-pruned pose transformer for monocular and multi-view human pose estimation

H Ma, Z Wang, Y Chen, D Kong, L Chen, X Liu… - … on Computer Vision, 2022 - Springer

Recently, the vision transformer and its variants have played an increasingly important role
in both monocular and multi-view human pose estimation. Considering image patches as …

Cited by 50 · Related articles · All 6 versions

[PDF] arxiv.org

ViTCoD: Vision transformer acceleration via dedicated algorithm and accelerator co-design

H You, Z Sun, H Shi, Z Yu, Y Zhao… - … Symposium on High …, 2023 - ieeexplore.ieee.org

Vision Transformers (ViTs) have achieved state-of-the-art performance on various vision
tasks. However, ViTs' self-attention module is still arguably a major bottleneck, limiting their …

Cited by 55 · Related articles · All 6 versions

[PDF] arxiv.org

Deep convolutional pooling transformer for deepfake detection

T Wang, H Cheng, KP Chow, L Nie - ACM transactions on multimedia …, 2023 - dl.acm.org

Recently, Deepfake has drawn considerable public attention due to security and privacy
concerns in social media digital forensics. As the wildly spreading Deepfake videos on the …

Cited by 43 · Related articles · All 4 versions

[PDF] arxiv.org

Model quantization and hardware acceleration for vision transformers: A comprehensive survey

D Du, G Gong, X Chu - arXiv preprint arXiv:2405.00314, 2024 - arxiv.org

Vision Transformers (ViTs) have recently garnered considerable attention, emerging as a
promising alternative to convolutional neural networks (CNNs) in several vision-related …

Cited by 3 · Related articles · All 2 versions

[PDF] mdpi.com

A hybrid model for driver emotion detection using feature fusion approach

SB Sukhavasi, SB Sukhavasi, K Elleithy… - International journal of …, 2022 - mdpi.com

Machine and deep learning techniques are two branches of artificial intelligence that have
proven very efficient in solving advanced human problems. The automotive industry is …

Cited by 29 · Related articles · All 17 versions

[PDF] arxiv.org

ViTA: A vision transformer inference accelerator for edge applications

S Nag, G Datta, S Kundu… - … on Circuits and …, 2023 - ieeexplore.ieee.org

Vision Transformer models, such as ViT, Swin Transformer, and Transformer-in-Transformer,
have recently gained significant traction in computer vision tasks due to their ability to …

Cited by 21 · Related articles · All 5 versions

[PDF] acm.org

TRON: Transformer neural network acceleration with non-coherent silicon photonics

S Afifi, F Sunny, M Nikdast, S Pasricha - Proceedings of the Great Lakes …, 2023 - dl.acm.org

Transformer neural networks are rapidly being integrated into state-of-the-art solutions for
natural language processing (NLP) and computer vision. However, the complex structure of …

Cited by 16 · Related articles · All 4 versions

[PDF] arxiv.org

Edge-MoE: Memory-efficient multi-task vision transformer architecture with task-level sparsity via mixture-of-experts

R Sarkar, H Liang, Z Fan, Z Wang… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org

The computer vision community is embracing two promising learning paradigms: the Vision
Transformer (ViT) and Multi-task Learning (MTL). ViT models show extraordinary …

Cited by 7 · Related articles · All 3 versions
