A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Transformer meets remote sensing video detection and tracking: A comprehensive survey

L Jiao, X Zhang, X Liu, F Liu, S Yang… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
Transformer has shown excellent performance in remote sensing field with long-range
modeling capabilities. Remote sensing video (RSV) moving object detection and tracking …

Neural architecture search: Insights from 1000 papers

C White, M Safari, R Sukthanker, B Ru, T Elsken… - arXiv preprint arXiv …, 2023 - arxiv.org
In the past decade, advances in deep learning have resulted in breakthroughs in a variety of
areas, including computer vision, natural language understanding, speech recognition, and …

Can GPT-4 perform neural architecture search?

M Zheng, X Su, S You, F Wang, C Qian, C Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
We investigate the potential of GPT-4 to perform Neural Architecture Search
(NAS), the task of designing effective neural architectures. Our proposed approach, …

Re-mine, learn and reason: Exploring the cross-modal semantic correlations for language-guided HOI detection

Y Cao, Q Tang, F Yang, X Su, S You… - Proceedings of the …, 2023 - openaccess.thecvf.com
Human-Object Interaction (HOI) detection is a challenging computer vision task that
requires visual models to address the complex interactive relationship between humans and …

Vision transformer slimming: Multi-dimension searching in continuous optimization space

A Chavan, Z Shen, Z Liu, Z Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper explores the feasibility of finding an optimal sub-model from a vision transformer
and introduces a pure vision transformer slimming (ViT-Slim) framework. It can search a sub …

Neural architecture search for transformers: A survey

KT Chitty-Venkata, M Emani, V Vishwanath… - IEEE …, 2022 - ieeexplore.ieee.org
Transformer-based Deep Neural Network architectures have gained tremendous interest
due to their effectiveness in various applications across Natural Language Processing (NLP) …

NASViT: Neural architecture search for efficient vision transformers with gradient conflict-aware supernet training

C Gong, D Wang - ICLR Proceedings 2022, 2022 - par.nsf.gov
Designing accurate and efficient vision transformers (ViTs) is an important but challenging
task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture …

LocalMamba: Visual state space model with windowed selective scan

T Huang, X Pei, S You, F Wang, C Qian… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in state space models, notably Mamba, have demonstrated
significant progress in modeling long sequences for tasks like language understanding. Yet …