Spherical transformer for LiDAR-based 3D recognition

X Lai, Y Chen, F Lu, J Liu, J Jia - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
LiDAR-based 3D point cloud recognition has benefited various applications. Without
specially considering the LiDAR point distribution, most current methods suffer from …

Stratified transformer for 3D point cloud segmentation

X Lai, J Liu, L Jiang, L Wang, H Zhao… - Proceedings of the …, 2022 - openaccess.thecvf.com
3D point cloud segmentation has made tremendous progress in recent years. Most
current methods focus on aggregating local features, but fail to directly model long-range …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Object-centric learning with capsule networks: A survey

F De Sousa Ribeiro, K Duarte, M Everett… - ACM Computing …, 2024 - dl.acm.org
Capsule networks emerged as a promising alternative to convolutional neural networks for
learning object-centric representations. The idea is to explicitly model part-whole hierarchies …

A survey on visual transformer

K Han, Y Wang, H Chen, X Chen, J Guo, Z Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

MixFormer: Mixing features across windows and dimensions

Q Chen, Q Wu, J Wang, Q Hu, T Hu… - Proceedings of the …, 2022 - openaccess.thecvf.com
While local-window self-attention performs notably in vision tasks, it suffers from limited
receptive field and weak modeling capability issues. This is mainly because it performs self …

ObjectFormer for image manipulation detection and localization

J Wang, Z Wu, J Chen, X Han… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recent advances in image editing techniques have posed serious challenges to the
trustworthiness of multimedia data, which drives the research of image tampering detection …

Mask-attention-free transformer for 3D instance segmentation

X Lai, Y Yuan, R Chu, Y Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, transformer-based methods have dominated 3D instance segmentation, where
mask attention is commonly involved. Specifically, object queries are guided by the initial …

Joint global and local hierarchical priors for learned image compression

JH Kim, B Heo, JS Lee - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Recently, learned image compression methods have outperformed traditional hand-crafted
ones including BPG. One of the keys to this success is learned entropy models that estimate …

Making vision transformers efficient from a token sparsification view

S Chang, P Wang, M Lin, F Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The quadratic computational complexity to the number of tokens limits the practical
applications of Vision Transformers (ViTs). Several works propose to prune redundant …