Segmenting or detecting objects in sparse Lidar point clouds are two important tasks in autonomous driving to allow a vehicle to act safely in its 3D environment. The best …
Transformer has been widely used for self-supervised pre-training in Natural Language Processing (NLP) and achieved great success. However, it has not been fully explored in …
C Tao, X Zhu, W Su, G Huang, B Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Self-supervised learning (SSL) has delivered superior performance on a variety of downstream vision tasks. Two main-stream SSL frameworks have been proposed, i.e., …
Humans learn powerful representations of objects and scenes by observing how they evolve over time. Yet, outside of specific tasks that require explicit temporal understanding, static …
Masked Image Modeling (MIM) has recently been established as a potent pre-training paradigm. A pretext task is constructed by masking patches in an input image, and …
The contrastive pre-training of a recognition model on a large dataset of unlabeled data often boosts the model's performance on downstream tasks like image classification …
Medical image segmentation has seen significant progress through the use of supervised deep learning. Hereby, large annotated datasets were employed to reliably segment …
We present a novel approach to unsupervised learning for video object segmentation (VOS). Unlike previous work, our formulation allows learning dense feature representations directly …
S Venkataramanan, E Kijak… - Advances in neural …, 2024 - proceedings.neurips.cc
Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Its extensions mostly focus on the definition of …