Vitas: Vision transformer architecture search

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier

Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

Cited by 22 · Related articles · All 6 versions

[PDF] baai.ac.cn

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 1677 · Related articles · All 7 versions

[PDF] ieee.org

Transformer meets remote sensing video detection and tracking: A comprehensive survey

L Jiao, X Zhang, X Liu, F Liu, S Yang… - IEEE Journal of …, 2023 - ieeexplore.ieee.org

Transformer has shown excellent performance in remote sensing field with long-range
modeling capabilities. Remote sensing video (RSV) moving object detection and tracking …

Cited by 11 · Related articles · All 2 versions

[PDF] arxiv.org

Neural architecture search: Insights from 1000 papers

C White, M Safari, R Sukthanker, B Ru, T Elsken… - arXiv preprint arXiv …, 2023 - arxiv.org

In the past decade, advances in deep learning have resulted in breakthroughs in a variety of
areas, including computer vision, natural language understanding, speech recognition, and …

Cited by 68 · Related articles · All 2 versions

[PDF] arxiv.org

Can gpt-4 perform neural architecture search?

M Zheng, X Su, S You, F Wang, C Qian, C Xu… - arXiv preprint arXiv …, 2023 - arxiv.org

We investigate the potential of GPT-4 to perform Neural Architecture Search
(NAS), the task of designing effective neural architectures. Our proposed approach, …

Cited by 49 · Related articles · All 3 versions

[PDF] thecvf.com

Re-mine, learn and reason: Exploring the cross-modal semantic correlations for language-guided hoi detection

Y Cao, Q Tang, F Yang, X Su, S You… - Proceedings of the …, 2023 - openaccess.thecvf.com

Human-Object Interaction (HOI) detection is a challenging computer vision task that
requires visual models to address the complex interactive relationship between humans and …

Cited by 12 · Related articles · All 5 versions

[PDF] thecvf.com

Vision transformer slimming: Multi-dimension searching in continuous optimization space

A Chavan, Z Shen, Z Liu, Z Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com

This paper explores the feasibility of finding an optimal sub-model from a vision transformer
and introduces a pure vision transformer slimming (ViT-Slim) framework. It can search a sub …

Cited by 59 · Related articles · All 8 versions

[PDF] ieee.org

Neural architecture search for transformers: A survey

KT Chitty-Venkata, M Emani, V Vishwanath… - IEEE …, 2022 - ieeexplore.ieee.org

Transformer-based Deep Neural Network architectures have gained tremendous interest
due to their effectiveness in various applications across Natural Language Processing (NLP) …

Cited by 46 · Related articles · All 5 versions

[PDF] nsf.gov

[PDF] Nasvit: Neural architecture search for efficient vision transformers with gradient conflict-aware supernet training

C Gong, D Wang - ICLR Proceedings 2022, 2022 - par.nsf.gov

Designing accurate and efficient vision transformers (ViTs) is an important but challenging
task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture …

Cited by 64 · Related articles · All 2 versions

[PDF] arxiv.org

Localmamba: Visual state space model with windowed selective scan

T Huang, X Pei, S You, F Wang, C Qian… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in state space models, notably Mamba, have demonstrated
significant progress in modeling long sequences for tasks like language understanding. Yet …

Cited by 28 · Related articles · All 2 versions
