PointMamba: A simple state space model for point cloud analysis

D Liang, X Zhou, W Xu, X Zhu, Z Zou, X Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have become one of the foundational architectures in point cloud analysis
tasks due to their excellent global modeling ability. However, the attention mechanism has …

Mamba in vision: A comprehensive survey of techniques and applications

MM Rahman, AA Tutul, A Nath, L Laishram… - arXiv preprint arXiv …, 2024 - arxiv.org
Mamba is emerging as a novel approach to overcome the challenges faced by
Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) in computer vision …

Transformers to SSMs: Distilling quadratic knowledge to subquadratic models

A Bick, KY Li, EP Xing, JZ Kolter, A Gu - arXiv preprint arXiv:2408.10189, 2024 - arxiv.org
Transformer architectures have become a dominant paradigm for domains like language
modeling but suffer in many inference settings due to their quadratic-time self-attention …

Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds

Z Wang, Z Zhao, Y Wu, L Zhou, D Xu - European Conference on Computer …, 2025 - Springer
Unsupervised domain adaptation (UDA) is a critical challenge in the field of point cloud
analysis. Previous works tackle the problem either by feature extractor adaptation to enable …

Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection

G Zhang, L Fan, C He, Z Lei, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Serialization-based methods, which serialize the 3D voxels and group them into multiple
sequences before inputting to Transformers, have demonstrated their effectiveness in 3D …

GraspMamba: A Mamba-based Language-driven Grasp Detection Framework with Hierarchical Feature Learning

HH Nguyen, A Vuong, A Nguyen, I Reid… - arXiv preprint arXiv …, 2024 - arxiv.org
Grasp detection is a fundamental robotic task critical to the success of many industrial
applications. However, current language-driven models for this task often struggle with …

MambaPlace: Text-to-Point-Cloud Cross-Modal Place Recognition with Attention Mamba Mechanisms

T Shang, Z Li, W Pei, P Xu, ZJ Deng, F Kong - arXiv preprint arXiv …, 2024 - arxiv.org
Vision Language Place Recognition (VLVPR) enhances robot localization performance by
incorporating natural language descriptions from images. By utilizing language information …

Towards 3D Semantic Scene Completion for Autonomous Driving: A Meta-Learning Framework Empowered by Deformable Large-Kernel Attention and Mamba Model

Y Qu, Z Huang, Z Sheng, T Chen, S Chen - arXiv preprint arXiv …, 2024 - arxiv.org
Semantic scene completion (SSC) is essential for achieving comprehensive perception in
autonomous driving systems. However, existing SSC methods often overlook the high …

DSDFormer: An Innovative Transformer-Mamba Framework for Robust High-Precision Driver Distraction Identification

J Chen, Z Zhang, J Yu, H Huang, R Zhang, X Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Driver distraction remains a leading cause of traffic accidents, posing a critical threat to road
safety globally. As intelligent transportation systems evolve, accurate and real-time …

MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering

Y Tian, S Bai, Z Luo, Y Wang, Y Lv, FY Wang - arXiv preprint arXiv …, 2024 - arxiv.org
Occupancy prediction has attracted intensive attention and shown great superiority in the
development of autonomous driving systems. The fine-grained environmental …