CoPa: General robotic manipulation through spatial constraints of parts with foundation models

H Huang, F Lin, Y Hu, S Wang, Y Gao - arXiv preprint arXiv:2403.08248, 2024 - arxiv.org
Foundation models pre-trained on web-scale data are shown to encapsulate extensive
world knowledge beneficial for robotic manipulation in the form of task planning. However …

Pave the way to grasp anything: Transferring foundation models for universal pick-place robots

J Yang, W Tan, C Jin, B Liu, J Fu, R Song… - arXiv preprint arXiv …, 2023 - arxiv.org
Improving the generalization capabilities of general-purpose robotic agents has long been a
significant challenge actively pursued by research communities. Existing approaches often …

Physically grounded vision-language models for robotic manipulation

J Gao, B Sarkar, F Xia, T Xiao, J Wu, B Ichter… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in vision-language models (VLMs) have led to improved performance on
tasks such as visual question answering and image captioning. Consequently, these models …

AlphaBlock: Embodied finetuning for vision-language reasoning in robot manipulation

C Jin, W Tan, J Yang, B Liu, R Song, L Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose a novel framework for learning high-level cognitive capabilities in robot
manipulation tasks, such as making a smiley face using building blocks. These tasks often …

Deep compositional robotic planners that follow natural language commands

YL Kuo, B Katz, A Barbu - 2020 IEEE international conference …, 2020 - ieeexplore.ieee.org
We demonstrate how a sampling-based robotic planner can be augmented to learn to
understand a sequence of natural language commands in a continuous configuration space …

Geometry-based grasping pipeline for bi-modal pick and place

R Haschke, G Walck, H Ritter - 2021 IEEE/RSJ International …, 2021 - ieeexplore.ieee.org
We propose an autonomous grasping pipeline that relies on geometric information extracted
from segmented point cloud data. This is in contrast to many recent approaches leveraging …

Robot task planning and situation handling in open worlds

Y Ding, X Zhang, S Amiri, N Cao, H Yang… - arXiv preprint arXiv …, 2022 - arxiv.org
Automated task planning algorithms have been developed to help robots complete complex
tasks that require multiple actions. Most of those algorithms have been developed for "…

Planning with spatial-temporal abstraction from point clouds for deformable object manipulation

X Lin, C Qi, Y Zhang, Z Huang, K Fragkiadaki… - arXiv preprint arXiv …, 2022 - arxiv.org
Effective planning of long-horizon deformable object manipulation requires suitable
abstractions at both the spatial and temporal levels. Previous methods typically either focus …

A long horizon planning framework for manipulating rigid pointcloud objects

A Simeonov, Y Du, B Kim, F Hogan… - … on Robot Learning, 2021 - proceedings.mlr.press
We present a framework for solving long-horizon planning problems involving manipulation
of rigid objects that operates directly from a point-cloud observation. Our method plans in the …

Hierarchical planning for long-horizon manipulation with geometric and symbolic scene graphs

Y Zhu, J Tremblay, S Birchfield… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
We present a visually grounded hierarchical planning algorithm for long-horizon
manipulation tasks. Our algorithm offers a joint framework of neuro-symbolic task planning …