Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?

C Han, Q Wang, Y Cui, W Wang, L Huang, S Qi… - arXiv preprint arXiv …, 2024 - arxiv.org
As the scale of vision models continues to grow, the emergence of Visual Prompt Tuning
(VPT) as a parameter-efficient transfer learning technique has gained attention due to its …

D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing

W Li, B Chen, S Liu, S Zhao, B Du… - … on Circuits and …, 2024 - ieeexplore.ieee.org
By mapping iterative optimization algorithms into neural networks (NNs), deep unfolding
networks (DUNs) exhibit well-defined and interpretable structures and achieve remarkable …

PTM: Torus masking for 3D representation learning guided by robust and trusted teachers

H Cheng, J Zhu, N Hu, J Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
3D Masked Point Modeling (MPM) typically involves randomly or blockly discarding points or
patches and then reconstructing them, offering a promising avenue for exploring geometric …

Outdoor activity classification using smartphone based inertial sensor measurements

R Bodhe, S Sivakumar, G Sakarkar, FH Juwono… - Multimedia Tools and …, 2024 - Springer
Human Activity Recognition (HAR) deals with the automatic recognition of physical
activities and plays a crucial role in healthcare and sports where wearable sensors and …

Physically-guided open vocabulary segmentation with weighted patched alignment loss

W Liu, J Lou, X Wang, W Zhou, J Cheng, X Yang - Neurocomputing, 2025 - Elsevier
Open vocabulary segmentation is a challenging task that aims to segment out the thousands
of unseen categories. Directly applying CLIP to open-vocabulary semantic segmentation is …

Robust Discriminative t-Linear Subspace Learning for Image Feature Extraction

K Liu, X Xiao, J You, Y Zhou - IEEE Transactions on Circuits …, 2024 - ieeexplore.ieee.org
Subspace learning has been widely applied for joint feature extraction and dimensionality
reduction, demonstrating significant efficacy. Numerous subspace learning methods with …

Prototypical Transformer as Unified Motion Learners

C Han, Y Lu, G Sun, JC Liang, Z Cao, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we introduce the Prototypical Transformer (ProtoFormer), a general and unified
framework that approaches various motion tasks from a prototype perspective. ProtoFormer …

Does Pixel Value Represent Facial Landmark Well in Heatmap?

X Lan, J Lyu, K Dong, H Jiang, Q Hu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Heatmap-based methods have dominated the face alignment task, yet the maximum
response decoding scheme necessitates further reform. While some studies have attempted …

Learning Unknowns from Unknowns: Diversified Negative Prototypes Generator for Few-Shot Open-Set Recognition

Z Zhang, G Chen, Y Zou, Y Li, R Li - Proceedings of the 32nd ACM …, 2024 - dl.acm.org
Few-shot open-set recognition (FSOR) is a challenging task that requires a model to
recognize known classes and identify unknown classes with limited labeled data. Existing …

SimpliFusion: a simplified infrared and visible image fusion network

Y Liu, X Li, W Zhong - The Visual Computer, 2024 - Springer
This paper introduces SimpliFusion, a network designed for the fusion of infrared and visible
images, leveraging a simplified transformer architecture. SimpliFusion is engineered to …