A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

J Gui, T Chen, J Zhang, Q Cao, Z Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …

Advancing plain vision transformer toward remote sensing foundation model

D Wang, Q Zhang, Y Xu, J Zhang, B Du… - … on Geoscience and …, 2022 - ieeexplore.ieee.org
Large-scale vision foundation models have made significant progress in visual tasks on
natural images, with vision transformers (ViTs) being the primary choice due to their good …

An empirical study of remote sensing pretraining

D Wang, J Zhang, B Du, GS Xia… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Deep learning has largely reshaped remote sensing (RS) research for aerial image
understanding and made a great success. Nevertheless, most of the existing deep models …

DDPM-CD: Remote sensing change detection using denoising diffusion probabilistic models

WGC Bandara, NG Nair, VM Patel - arXiv preprint arXiv:2206.11892, 2022 - arxiv.org
Human civilization has an increasingly powerful influence on the earth system, and earth
observations are an invaluable tool for assessing and mitigating the negative impacts. To …

CMID: A unified self-supervised learning framework for remote sensing image understanding

D Muhtar, X Zhang, P Xiao, Z Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Self-supervised learning (SSL) has gained wide-spread attention in the remote sensing (RS)
and Earth observation (EO) communities owing to its ability to learn task-agnostic …

ViTPose++: Vision transformer for generic body pose estimation

Y Xu, J Zhang, Q Zhang, D Tao - IEEE Transactions on Pattern …, 2023 - ieeexplore.ieee.org
In this paper, we show the surprisingly good properties of plain vision transformers for body
pose estimation from various aspects, namely simplicity in model structure, scalability in …

Embedding global contrastive and local location in self-supervised learning

W Zhao, C Li, W Zhang, L Yang… - … on Circuits and …, 2022 - ieeexplore.ieee.org
Self-supervised representation learning (SSL) typically suffers from inadequate data
utilization and feature-specificity due to the suboptimal sampling strategy and the …

ViTPose++: Vision Transformer Foundation Model for Generic Body Pose Estimation

Y Xu, J Zhang, Q Zhang, D Tao - arXiv preprint arXiv:2212.04246, 2022 - arxiv.org
In this paper, we show the surprisingly good properties of plain vision transformers for body
pose estimation from various aspects, namely simplicity in model structure, scalability in …

I3CL: Intra- and inter-instance collaborative learning for arbitrary-shaped scene text detection

B Du, J Ye, J Zhang, J Liu, D Tao - International Journal of Computer …, 2022 - Springer
Existing methods for arbitrary-shaped text detection in natural scenes face two critical
issues, i.e., (1) fracture detections at the gaps in a text instance; and (2) inaccurate detections …

LESSL: Can LEGO sampling and collaborative optimization contribute to self-supervised learning?

W Zhao, W Zhang, X Pan, P Zhuang, X Xie, L Li… - Information …, 2022 - Elsevier
Self-supervised visual representation learning (SSL) aims to extract the most distinctive
features from unlabeled datasets to overcome challenges of labor-intensive and time …