Ape: Aligning pretrained encoders to quickly learn aligned multimodal representations

3 results (0.03 sec)

[PDF] thecvf.com

Sam-clip: Merging vision foundation models towards semantic and spatial understanding

H Wang, PKA Vasu, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com

The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …

Cited by 46 · Related articles · All 7 versions

[PDF] arxiv.org

SeFi-CD: A Semantic First Change Detection Paradigm That Can Detect Any Change You Want

L Zhao, Z Huang, D Kuang, C Peng, J Gan… - arXiv preprint arXiv …, 2024 - arxiv.org

The existing change detection (CD) methods can be summarized as the visual-first change
detection (ViFi-CD) paradigm, which first extracts change features from visual differences …

Advancing Multi-Modal Sensing Through Expandable Modality Alignment

S Dai, S Jiang, Y Yang, T Cao, M Li, S Banerjee… - arXiv preprint arXiv …, 2024 - arxiv.org

Sensing technology is widely used for comprehending the physical world, with numerous
modalities explored in past decades. While there has been considerable work on multi …
