SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

H Wang, PKA Vasu, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …

SeFi-CD: A Semantic First Change Detection Paradigm That Can Detect Any Change You Want

L Zhao, Z Huang, D Kuang, C Peng, J Gan… - arXiv preprint arXiv …, 2024 - arxiv.org
The existing change detection (CD) methods can be summarized as the visual-first change
detection (ViFi-CD) paradigm, which first extracts change features from visual differences …

Advancing Multi-Modal Sensing Through Expandable Modality Alignment

S Dai, S Jiang, Y Yang, T Cao, M Li, S Banerjee… - arXiv preprint arXiv …, 2024 - arxiv.org
Sensing technology is widely used for comprehending the physical world, with numerous
modalities explored in past decades. While there has been considerable work on multi …