Multi-view vehicle detection based on fusion part model with active learning

DL Li, M Prasad, CL Liu, CT Lin - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Computer vision-based vehicle detection techniques are widely used in real-world
applications. However, most of these techniques aim to detect only single-view vehicles, and …

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

W Dong, X Zhang, B Chen, D Yan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Parameter-efficient fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a
model to downstream tasks by learning a minimal set of new adaptation parameters while …

LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning

S Mo, Y Wang, X Luo, D Li - arXiv preprint arXiv:2402.17406, 2024 - arxiv.org
Visual Prompt Tuning (VPT) techniques have gained prominence for their capacity to adapt
pre-trained Vision Transformers (ViTs) to downstream visual tasks using specialized …

DIMA: Digging into Multigranular Archetype for Fine-Grained Object Detection

J Cheng, X Yao, X Yang, X Yuan… - … on Geoscience and …, 2024 - ieeexplore.ieee.org
Fine-grained remote sensing object detection aims at precisely locating objects and
determining the fine-level categories. This task is exceptionally challenging due to the …

SA²VP: Spatially Aligned-and-Adapted Visual Prompt

W Pei, T Xia, F Chen, J Li, J Tian, G Lu - Proceedings of the AAAI …, 2024 - ojs.aaai.org
As a prominent parameter-efficient fine-tuning technique in NLP, prompt tuning is being
explored its potential in computer vision. Typical methods for visual prompt tuning follow the …

Image-based surrogates of socio-economic status in urban neighborhoods using deep multiple instance learning

C Diou, P Lelekas, A Delopoulos - Journal of imaging, 2018 - mdpi.com
(1) Background: Evidence-based policymaking requires data about the local population's
socioeconomic status (SES) at detailed geographical level, however, such information is …

Do we really need a large number of visual prompts?

Y Kim, Y Li, A Moitra, R Yin, P Panda - Neural Networks, 2024 - Elsevier
Due to increasing interest in adapting models on resource-constrained edges, parameter-
efficient transfer learning has been widely explored. Among various methods, Visual Prompt …

Rethinking nearest neighbors for visual classification

M Jia, BC Chen, Z Wu, C Cardie, S Belongie… - arXiv preprint arXiv …, 2021 - arxiv.org
Neural network classifiers have become the de-facto choice for current "pre-train then
fine-tune" paradigms of visual classification. In this paper, we investigate k-Nearest-Neighbor (k …

Mini but Mighty: Finetuning ViTs with Mini Adapters

IE Marouf, E Tartaglione… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Vision Transformers (ViTs) have become one of the dominant architectures in
computer vision, and pre-trained ViT models are commonly adapted to new tasks via fine …

The Dollar Street dataset: Images representing the geographic and socioeconomic diversity of the world

W Gaviria Rojas, S Diamos, K Kini… - Advances in …, 2022 - proceedings.neurips.cc
It is crucial that image datasets for computer vision are representative and contain accurate
demographic information to ensure their robustness and fairness, especially for smaller …