Z Han, C Gao, J Liu, SQ Zhang - arXiv preprint arXiv:2403.14608, 2024 - arxiv.org
Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented …
Z Wei, L Chen, Y Jin, X Ma, T Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we first assess and harness various Vision Foundation Models (VFMs) in the context of Domain Generalized Semantic Segmentation (DGSS). Driven by the motivation …
Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative for full fine-tuning so as to adapt pre-trained vision models to downstream tasks, which only …
In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image …
Parameter-efficient transfer learning (PETL) is an emerging research spot aimed at inexpensively adapting large-scale pre-trained models to downstream tasks. Recent …
J Park, J Lee, K Sohn - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
In this paper, we efficiently transfer the surpassing representation power of the vision foundation models, such as ViT and Swin, for video understanding with only a few trainable …
ZY Hu, Y Li, MR Lyu, L Wang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
As the model size of pre-trained language models (PLMs) grows rapidly, full fine-tuning becomes prohibitively expensive for model training and storage. In vision-and-language …
Y Xin, S Luo, H Zhou, J Du, X Liu, Y Fan, Q Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale pre-trained vision models (PVMs) have shown great potential for adaptability across various downstream vision tasks. However, with state-of-the-art PVMs growing to …
In this paper, we study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods, e.g., GANs and diffusion models …