TokenCut: Segmenting objects in images and videos with self-supervised transformer and normalized cut

Y Wang, X Shen, Y Yuan, Y Du, M Li… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
In this paper, we describe a graph-based algorithm that uses the features obtained by a
self-supervised transformer to detect and segment salient objects in images and videos. With this …

A survey on masked autoencoder for self-supervised learning in vision and beyond

C Zhang, C Zhang, J Song, JSK Yi, K Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
Masked autoencoders are scalable vision learners, as the title of MAE (He et al., 2022),
which suggests that self-supervised learning (SSL) in vision might undertake a similar …

Auxiliary tasks benefit 3D skeleton-based human motion prediction

C Xu, RT Tan, Y Tan, S Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Exploring spatial-temporal dependencies from observed motions is one of the core
challenges of human motion prediction. Previous methods mainly focus on dedicated …

Ponder: Point cloud pre-training via neural rendering

D Huang, S Peng, T He, H Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a novel approach to self-supervised learning of point cloud representations by
differentiable neural rendering. Motivated by the fact that informative point cloud features …

FireRisk: A remote sensing dataset for fire risk assessment with benchmarks using supervised and self-supervised learning

S Shen, S Seneviratne, X Wanyan… - … Conference on Digital …, 2023 - ieeexplore.ieee.org
In recent decades, wildfires have caused tremendous property losses, fatalities, and
extensive damage to forest ecosystems. Inspired by the abundance of publicly available …

Disentangling spatial and temporal learning for efficient image-to-video transfer learning

Z Qing, S Zhang, Z Huang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, large-scale pre-trained language-image models like CLIP have shown
extraordinary capabilities for understanding spatial contents, but naively transferring such …

Traj-MAE: Masked autoencoders for trajectory prediction

H Chen, J Wang, K Shao, F Liu, J Hao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Trajectory prediction has been a crucial task in building a reliable autonomous driving
system by anticipating possible dangers. One key issue is to generate consistent trajectory …

Rethinking tokenizer and decoder in masked graph modeling for molecules

Z Liu, Y Shi, A Zhang, E Zhang… - Advances in …, 2024 - proceedings.neurips.cc
Masked graph modeling excels in the self-supervised representation learning of molecular
graphs. Scrutinizing previous studies, we can reveal a common scheme consisting of three …

Multi-robot scene completion: Towards task-agnostic collaborative perception

Y Li, J Zhang, D Ma, Y Wang… - Conference on Robot …, 2023 - proceedings.mlr.press
Collaborative perception learns how to share information among multiple robots to perceive
the environment better than individually done. Past research on this has been task-specific …

SwitchTab: Switched autoencoders are effective tabular learners

J Wu, S Chen, Q Zhao, R Sergazinov, C Li… - Proceedings of the …, 2024 - ojs.aaai.org
Self-supervised representation learning methods have achieved significant success in
computer vision and natural language processing (NLP), where data samples exhibit explicit …