Masked autoencoders as spatiotemporal learners

Tokencut: Segmenting objects in images and videos with self-supervised transformer and normalized cut

Y Wang, X Shen, Y Yuan, Y Du, M Li… - IEEE transactions on …, 2023 - ieeexplore.ieee.org

In this paper, we describe a graph-based algorithm that uses the features obtained by a
self-supervised transformer to detect and segment salient objects in images and videos. With this …

Cited by 53 · Related articles · All 29 versions

[PDF] arxiv.org

A survey on masked autoencoder for self-supervised learning in vision and beyond

C Zhang, C Zhang, J Song, JSK Yi, K Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org

Masked autoencoders are scalable vision learners, as the title of MAE \cite{he2022masked},
which suggests that self-supervised learning (SSL) in vision might undertake a similar …

Cited by 64 · Related articles · All 2 versions

[PDF] thecvf.com

Auxiliary tasks benefit 3d skeleton-based human motion prediction

C Xu, RT Tan, Y Tan, S Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Exploring spatial-temporal dependencies from observed motions is one of the core
challenges of human motion prediction. Previous methods mainly focus on dedicated …

Cited by 11 · Related articles · All 5 versions

[PDF] thecvf.com

Ponder: Point cloud pre-training via neural rendering

D Huang, S Peng, T He, H Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

We propose a novel approach to self-supervised learning of point cloud representations by
differentiable neural rendering. Motivated by the fact that informative point cloud features …

Cited by 26 · Related articles · All 5 versions

[PDF] arxiv.org

Firerisk: A remote sensing dataset for fire risk assessment with benchmarks using supervised and self-supervised learning

S Shen, S Seneviratne, X Wanyan… - … Conference on Digital …, 2023 - ieeexplore.ieee.org

In recent decades, wildfires have caused tremendous property losses, fatalities, and
extensive damage to forest ecosystems. Inspired by the abundance of publicly available …

Cited by 243 · Related articles · All 8 versions

[PDF] thecvf.com

Disentangling spatial and temporal learning for efficient image-to-video transfer learning

Z Qing, S Zhang, Z Huang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, large-scale pre-trained language-image models like CLIP have shown
extraordinary capabilities for understanding spatial contents, but naively transferring such …

Cited by 13 · Related articles · All 5 versions

[PDF] thecvf.com

Traj-mae: Masked autoencoders for trajectory prediction

H Chen, J Wang, K Shao, F Liu, J Hao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Trajectory prediction has been a crucial task in building a reliable autonomous driving
system by anticipating possible dangers. One key issue is to generate consistent trajectory …

Cited by 23 · Related articles · All 6 versions

[PDF] neurips.cc

Rethinking tokenizer and decoder in masked graph modeling for molecules

Z Liu, Y Shi, A Zhang, E Zhang… - Advances in …, 2024 - proceedings.neurips.cc

Masked graph modeling excels in the self-supervised representation learning of molecular
graphs. Scrutinizing previous studies, we can reveal a common scheme consisting of three …

Cited by 9 · Related articles · All 5 versions

[PDF] mlr.press

Multi-robot scene completion: Towards task-agnostic collaborative perception

Y Li, J Zhang, D Ma, Y Wang… - Conference on Robot …, 2023 - proceedings.mlr.press

Collaborative perception learns how to share information among multiple robots to perceive
the environment better than individually done. Past research on this has been task-specific …

Cited by 40 · Related articles · All 5 versions

[PDF] aaai.org

Switchtab: Switched autoencoders are effective tabular learners

J Wu, S Chen, Q Zhao, R Sergazinov, C Li… - Proceedings of the …, 2024 - ojs.aaai.org

Self-supervised representation learning methods have achieved significant success in
computer vision and natural language processing (NLP), where data samples exhibit explicit …

Cited by 19 · Related articles · All 5 versions
