Self-supervision through random segments with autoregressive coding (RandSAC)

S Li, L Zhang, Z Wang, D Wu, L Wu, Z Liu, J Xia… - arXiv preprint arXiv …, 2023 - arxiv.org

As the deep learning revolution marches on, self-supervised learning has garnered
increasing attention in recent years thanks to its remarkable representation learning ability …

Cited by 9 · Related articles · All 2 versions

[PDF] neurips.cc

Obj2seq: Formatting objects as sequences with class prompt for visual tasks

Z Chen, Y Zhu, Z Li, F Yang, W Li… - Advances in …, 2022 - proceedings.neurips.cc

Visual tasks vary a lot in their output formats and concerned contents, therefore it is hard to
process them with an identical structure. One main obstacle lies in the high-dimensional …

Cited by 19 · Related articles · All 6 versions

[PDF] openreview.net

Rejuvenating image-gpt as strong visual representation learners

S Ren, Z Wang, H Zhu, J Xiao, A Yuille… - Forty-first International …, 2023 - openreview.net

This paper enhances image-GPT (iGPT), one of the pioneering works that introduce
autoregressive pretraining to predict the next pixels for visual representation learning. Two …

Cited by 4 · Related articles · All 3 versions

[PDF] aaai.org

Exploring stochastic autoregressive image modeling for visual representation

Y Qi, F Yang, Y Zhu, Y Liu, L Wu, R Zhao… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

Autoregressive language modeling (ALM) has been successfully used in self-supervised pre-
training in Natural language processing (NLP). However, this paradigm has not achieved …

Cited by 12 · Related articles · All 4 versions

[PDF] thecvf.com

Self-Supervised Representation Learning from Arbitrary Scenarios

Z Li, Y Zhu, Z Chen, Z Gao, R Zhao… - Proceedings of the …, 2024 - openaccess.thecvf.com

Current self-supervised methods can primarily be categorized into contrastive learning and
masked image modeling. Extensive studies have demonstrated that combining these two …

[PDF] arxiv.org

Efficient masked autoencoders with self-consistency

Z Li, Y Zhu, Z Chen, W Li, R Zhao… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Inspired by the masked language modeling (MLM) in natural language processing tasks, the
masked image modeling (MIM) has been recognized as a strong self-supervised pre …

Cited by 4 · Related articles · All 7 versions

[PDF] aaai.org

Semantic-Aware Autoregressive Image Modeling for Visual Representation Learning

K Song, S Zhang, T Wang - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

The development of autoregressive modeling (AM) in computer vision lags behind natural
language processing (NLP) in self-supervised pre-training. This is mainly caused by the …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Look ahead or look around? a theoretical comparison between autoregressive and masked pretraining

Q Zhang, T Du, H Huang, Y Wang, Y Wang - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, the rise of generative self-supervised learning (SSL) paradigms has
exhibited impressive performance across visual, language, and multi-modal domains. While …

Cited by 1 · Related articles

[PDF] arxiv.org

Autoregressive Pretraining with Mamba in Vision

S Ren, X Li, H Tu, F Wang, F Shu, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

The vision community has started to build with the recently developed state space model,
Mamba, as the new backbone for a range of tasks. This paper shows that Mamba's visual …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning

S Ren, H Zhu, C Wei, Y Li, A Yuille, C Xie - arXiv preprint arXiv …, 2024 - arxiv.org

This paper presents a new self-supervised video representation learning framework,
ARVideo, which autoregressively predicts the next video token in a tailored sequence order …

Cited by 1 · Related articles · All 2 versions
