Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation …
Transformer has shown excellent performance in the remote sensing field with its long-range modeling capabilities. Remote sensing video (RSV) moving object detection and tracking …
In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas, including computer vision, natural language understanding, speech recognition, and …
We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS), the task of designing effective neural architectures. Our proposed approach, …
Y Cao, Q Tang, F Yang, X Su, S You… - Proceedings of the …, 2023 - openaccess.thecvf.com
Human-Object Interaction (HOI) detection is a challenging computer vision task that requires visual models to address the complex interactive relationship between humans and …
This paper explores the feasibility of finding an optimal sub-model from a vision transformer and introduces a pure vision transformer slimming (ViT-Slim) framework. It can search a sub …
Transformer-based Deep Neural Network architectures have gained tremendous interest due to their effectiveness in various applications across Natural Language Processing (NLP) …
C Gong, D Wang - ICLR Proceedings 2022, 2022 - par.nsf.gov
Designing accurate and efficient vision transformers (ViTs) is an important but challenging task. Supernet-based one-shot neural architecture search (NAS) enables fast architecture …
T Huang, X Pei, S You, F Wang, C Qian… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in state space models, notably Mamba, have demonstrated significant progress in modeling long sequences for tasks like language understanding. Yet …