OMG-Seg: Is one model good enough for all segmentation?

X Li, H Yuan, W Li, H Ding, S Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this work, we address various segmentation tasks, each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently …

Mimic before reconstruct: Enhancing masked autoencoders with feature mimicking

P Gao, Z Lin, R Zhang, R Fang, H Li, H Li… - International Journal of …, 2024 - Springer
Masked Autoencoders (MAE) have been popular paradigms for large-scale vision
representation pre-training. However, MAE solely reconstructs the low-level RGB signals …

Pre-training with random orthogonal projection image modeling

M Haghighat, P Moghadam, S Mohamed… - arXiv preprint arXiv …, 2023 - arxiv.org
Masked Image Modeling (MIM) is a powerful self-supervised strategy for visual pre-training
without the use of labels. MIM applies random crops to input images, processes them with …

SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation

K Yin, V Rao, R Jiang, X Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Self-supervised landmark estimation is a challenging task that demands the formation of
locally distinct feature representations to identify sparse facial landmarks in the absence of …

Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work

Q Wang, Y Yin - arXiv preprint arXiv:2306.01929, 2023 - arxiv.org
Inspired by the fact that human brains can emphasize discriminative parts of the input and
suppress irrelevant ones, substantial local mechanisms have been designed to boost the …

w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training

OL Santos, K Rosero, RA Lotufo - arXiv preprint arXiv:2312.06907, 2023 - arxiv.org
Sound Event Detection and Localization (SELD) constitutes a complex task that depends on
extensive multichannel audio recordings with annotated sound events and their respective …