Clio: Real-time Task-Driven Open-Set 3D Scene Graphs

D Maggio, Y Chang, N Hughes, M Trang… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern tools for class-agnostic image segmentation (e.g., SegmentAnything) and open-set
semantic understanding (e.g., CLIP) provide unprecedented opportunities for robot …

What Foundation Models can Bring for Robot Learning in Manipulation: A Survey

D Li, Y Jin, H Yu, J Shi, X Hao, P Hao, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The realization of universal robots is an ultimate goal of researchers. However, a key hurdle
in achieving this goal lies in the robots' ability to manipulate objects in their unstructured …

Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation

H Thomas, J Zhang - arXiv preprint arXiv:2403.13777, 2024 - arxiv.org
This paper presents the Embedding Pose Graph (EPG), an innovative method that combines
the strengths of foundation models with a simple 3D representation suitable for robotics …

Learning-based legged locomotion: state of the art and future perspectives

S Ha, J Lee, M van de Panne, Z Xie, W Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
Legged locomotion holds the premise of universal mobility, a critical capability for many
real-world robotic applications. Both model-based and learning-based approaches have …

Uncertainty-Aware Deployment of Pre-trained Language-Conditioned Imitation Learning Policies

B Wu, BD Lee, K Daniilidis, B Bucher… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale robotic policies trained on data from diverse tasks and robotic platforms hold
great promise for enabling general-purpose robots; however, reliable generalization to new …

CLIPSwarm: Generating Drone Shows from Text Prompts with Vision-Language Models

P Pueyo, E Montijano, AC Murillo… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces CLIPSwarm, a new algorithm designed to automate the modeling of
swarm drone formations based on natural language. The algorithm begins by enriching a …

Knowledge Transfer for Cross-Domain Reinforcement Learning: A Systematic Review

SA Serrano, J Martinez-Carranza, LE Sucar - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) provides a framework in which agents can be trained, via trial
and error, to solve complex decision-making problems. Learning with little supervision …

Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors

N Tsagkas, J Rome, S Ramamoorthy… - arXiv preprint arXiv …, 2024 - arxiv.org
Precise manipulation that is generalizable across scenes and objects remains a persistent
challenge in robotics. Current approaches for this task heavily depend on having a …

TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation

S Dass, W Ai, Y Jiang, S Singh, J Hu, R Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
A critical bottleneck limiting imitation learning in robotics is the lack of data. This problem is
more severe in mobile manipulation, where collecting demonstrations is harder than in …

Verifiably Following Complex Robot Instructions with Foundation Models

B Quartey, E Rosen, S Tellex, G Konidaris - arXiv preprint arXiv …, 2024 - arxiv.org
Enabling robots to follow complex natural language instructions is an important yet
challenging problem. People want to flexibly express constraints, refer to arbitrary landmarks …