Abstract: 3D point cloud segmentation has made tremendous progress in recent years. Most current methods focus on aggregating local features but fail to directly model long-range …
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation …
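The snippet above refers to the self-attention mechanism at the core of Transformers. As a point of reference, a minimal single-head scaled dot-product self-attention can be sketched as follows (a generic illustration with random projection matrices, not the implementation of any listed paper):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Minimal scaled dot-product self-attention over a token sequence.

    x:          (n, d)  sequence of n token embeddings
    Wq/Wk/Wv:   (d, dk) learned projections (random here, for illustration)
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv               # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (n, n) pairwise similarities
    a = np.exp(scores - scores.max(-1, keepdims=True))
    a /= a.sum(-1, keepdims=True)                  # row-wise softmax
    return a @ v                                   # each output mixes all tokens

rng = np.random.default_rng(0)
n, d, dk = 5, 8, 4
x = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, dk)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

Because every output token attends to every input token, this mechanism directly models long-range dependencies, which is the "strong representation" property the snippet alludes to.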
Capsule networks emerged as a promising alternative to convolutional neural networks for learning object-centric representations. The idea is to explicitly model part-whole hierarchies …
Q Chen, Q Wu, J Wang, Q Hu, T Hu… - Proceedings of the …, 2022 - openaccess.thecvf.com
While local-window self-attention performs notably well in vision tasks, it suffers from a limited receptive field and weak modeling capability. This is mainly because it performs self …
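The limited receptive field mentioned in this snippet follows directly from the windowing: each token attends only within its own non-overlapping window. A minimal sketch (identity projections, single head, for illustration only; not the method of any listed paper):

```python
import numpy as np

def window_self_attention(x, window):
    """Non-overlapping local-window self-attention sketch.

    Tokens attend only to tokens inside the same window of size `window`,
    so information cannot cross window boundaries in a single layer.
    """
    n, d = x.shape
    assert n % window == 0, "sequence length must be divisible by window size"
    out = np.empty_like(x)
    for s in range(0, n, window):
        w = x[s:s + window]                        # tokens in one window
        scores = w @ w.T / np.sqrt(d)              # attention stays inside the window
        a = np.exp(scores - scores.max(-1, keepdims=True))
        a /= a.sum(-1, keepdims=True)              # softmax per row
        out[s:s + window] = a @ w
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4))
out = window_self_attention(x, window=4)
print(out.shape)  # (8, 4)
```

Changing a token in the second window leaves the first window's outputs untouched, which is exactly the restricted-receptive-field behavior the snippet criticizes.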
Recent advances in image editing techniques have posed serious challenges to the trustworthiness of multimedia data, which drives the research of image tampering detection …
Recently, transformer-based methods have dominated 3D instance segmentation, where mask attention is commonly involved. Specifically, object queries are guided by the initial …
JH Kim, B Heo, JS Lee - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Recently, learned image compression methods have outperformed traditional hand-crafted ones, including BPG. One of the keys to this success is learned entropy models that estimate …
S Chang, P Wang, M Lin, F Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The quadratic computational complexity to the number of tokens limits the practical applications of Vision Transformers (ViTs). Several works propose to prune redundant …
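Token pruning, as referenced in this last snippet, reduces the quadratic attention cost by discarding less informative tokens. A generic top-k sketch (the importance score here is a random stand-in; real methods derive it, e.g., from attention weights):

```python
import numpy as np

def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Keep the top-k tokens by an importance score and drop the rest.

    tokens: (n, d) patch-token embeddings
    scores: (n,)   per-token importance (hypothetical stand-in values here)
    Shrinking n directly shrinks the O(n^2) self-attention cost.
    """
    k = max(1, int(len(tokens) * keep_ratio))
    keep = np.argsort(scores)[-k:]       # indices of the k most important tokens
    keep.sort()                          # preserve original token order
    return tokens[keep], keep

rng = np.random.default_rng(2)
tokens = rng.standard_normal((16, 32))   # 16 patch tokens, 32-dim embeddings
scores = rng.random(16)                  # stand-in importance scores
kept, idx = prune_tokens(tokens, scores, keep_ratio=0.25)
print(kept.shape)  # (4, 32)
```

Halving the token count cuts attention FLOPs roughly fourfold, which is why pruning is a common route to practical ViT deployment.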