The emergence of large language models at scale, with GPT-4 as a prominent example, has significantly accelerated the advancement of artificial general intelligence and …
L Wang, M Zhang, X Gao, W Shi - Remote Sensing, 2024 - mdpi.com
Change detection (CD) in remote sensing (RS) imagery is a pivotal method for detecting changes in the Earth's surface, finding wide applications in urban planning, disaster …
We introduce a method to train vision-language models for remote-sensing images without using any textual annotations. Our key insight is to use co-located internet imagery taken on …
C Yang, Z Li, L Zhang - IEEE Transactions on Geoscience and …, 2024 - ieeexplore.ieee.org
Recently, remote sensing image captioning (RSIC) has gained significant attention in the remote sensing community. Due to the significant differences in spatial resolution of remote …
The revolutionary capabilities of large language models (LLMs) have paved the way for multimodal large language models (MLLMs) and fostered diverse applications across …
This paper introduces Qatent PatFig, a novel large-scale patent figure dataset comprising 30,000+ patent figures from over 11,000 European patent applications. For each …
Image captioning and cross-modal retrieval are examples of tasks that involve the joint analysis of visual and linguistic information. In connection to remote sensing imagery, these …
E Salin, S Ayache, B Favre - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Vision-language foundation models have seen a considerable increase in performance in the last few years. However, there is still a lack of comprehensive evaluation methods able to …
The foundation model (FM) has garnered significant attention for its remarkable transfer performance in downstream tasks. Typically, it undergoes task-agnostic pretraining on a …