PartSLIP: Low-shot part segmentation for 3D point clouds via pretrained image-language models

M Liu, Y Zhu, H Cai, S Han, Z Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generalizable 3D part segmentation is important but challenging in vision and robotics.
Training deep models via conventional supervised methods requires large-scale 3D …

ManiSkill2: A unified benchmark for generalizable manipulation skills

J Gu, F Xiang, X Li, Z Ling, X Liu, T Mu, Y Tang… - arXiv preprint arXiv …, 2023 - arxiv.org
Generalizable manipulation skills, which can be composed to tackle long-horizon and
complex daily chores, are one of the cornerstones of Embodied AI. However, existing …

ToolFlowNet: Robotic manipulation with tools via predicting tool flow from point clouds

D Seita, Y Wang, SJ Shetty, EY Li… - … on Robot Learning, 2023 - proceedings.mlr.press
Point clouds are a widely available and canonical data modality that conveys the 3D
geometry of a scene. Despite significant progress in classification and segmentation from …

DexArt: Benchmarking generalizable dexterous manipulation with articulated objects

C Bao, H Xu, Y Qin, X Wang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
To enable general-purpose robots, robots must be able to operate daily articulated
objects as humans do. Current robot manipulation has relied heavily on using a parallel …

SUGAR: Pre-training 3D Visual Representations for Robotics

S Chen, R Garcia, I Laptev… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Learning generalizable visual representations from Internet data has yielded promising
results for robotics. Yet prevailing approaches focus on pre-training 2D representations …

PolarNet: 3D point clouds for language-guided robotic manipulation

S Chen, R Garcia, C Schmid, I Laptev - arXiv preprint arXiv:2309.15596, 2023 - arxiv.org
The ability of robots to comprehend and execute manipulation tasks based on natural
language instructions is a long-term goal in robotics. The dominant approaches for …

ARNOLD: A benchmark for language-grounded task learning with continuous states in realistic 3D scenes

R Gong, J Huang, Y Zhao, H Geng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Understanding the continuous states of objects is essential for task learning and planning in
the real world. However, most existing task learning benchmarks assume discrete (e.g., …

A universal semantic-geometric representation for robotic manipulation

T Zhang, Y Hu, H Cui, H Zhao, Y Gao - arXiv preprint arXiv:2306.10474, 2023 - arxiv.org
Robots rely heavily on sensors, especially RGB and depth cameras, to perceive and interact
with the world. RGB cameras record 2D images with rich semantic information while missing …

Robot Synesthesia: In-hand manipulation with visuotactile sensing

Y Yuan, H Che, Y Qin, B Huang, ZH Yin… - arXiv preprint arXiv …, 2023 - touchprocessing.org
Executing contact-rich manipulation tasks necessitates the fusion of tactile and visual
feedback. However, the distinct nature of these modalities poses significant challenges. In …

On the efficacy of 3D point cloud reinforcement learning

Z Ling, Y Yao, X Li, H Su - arXiv preprint arXiv:2306.06799, 2023 - arxiv.org
Recent studies on visual reinforcement learning (visual RL) have explored the use of 3D
visual representations. However, none of these works has systematically compared the …