StructDiffusion: Language-guided creation of physically-valid structures using unseen objects

W Liu, Y Du, T Hermans, S Chernova… - arXiv preprint arXiv …, 2022 - arxiv.org
Robots operating in human environments must be able to rearrange objects into
semantically-meaningful configurations, even if these objects are previously unseen. In this …

StructDiffusion: Object-centric diffusion for semantic rearrangement of novel objects

W Liu, T Hermans, S Chernova… - Workshop on Language …, 2022 - openreview.net
Robots operating in human environments must be able to rearrange objects into
semantically-meaningful configurations, even if these objects are previously unseen. In this …

Leveraging commonsense knowledge from large language models for task and motion planning

Y Ding, X Zhang, C Paxton, S Zhang - RSS 2023 Workshop on …, 2023 - openreview.net
Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning
is frequently needed in this process. However, achieving commonsense arrangements …

StructFormer: Learning spatial structure for language-guided semantic rearrangement of novel objects

W Liu, C Paxton, T Hermans… - … Conference on Robotics …, 2022 - ieeexplore.ieee.org
Geometric organization of objects into semantically meaningful arrangements pervades the
built world. As such, assistive robots operating in warehouses, offices, and homes would …

Latent space planning for multi-object manipulation with environment-aware relational classifiers

Y Huang, NC Taylor, A Conkey, W Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Objects rarely sit in isolation in everyday human environments. If we want robots to operate
and perform tasks in our human environments, they must understand how the objects they …

Dream2Real: Zero-shot 3D object rearrangement with vision-language models

I Kapelyukh, Y Ren, I Alzugaray… - First Workshop on Vision …, 2024 - openreview.net
We introduce Dream2Real, a robotics framework which integrates vision-language models
(VLMs) trained on 2D data into a 3D object rearrangement pipeline. This is achieved by the …

Predicting stable configurations for semantic placement of novel objects

C Paxton, C Xie, T Hermans… - Conference on robot …, 2022 - proceedings.mlr.press
Human environments contain numerous objects configured in a variety of arrangements.
Our goal is to enable robots to repose previously unseen objects according to learned …

Physically grounded vision-language models for robotic manipulation

J Gao, B Sarkar, F Xia, T Xiao, J Wu, B Ichter… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in vision-language models (VLMs) have led to improved performance on
tasks such as visual question answering and image captioning. Consequently, these models …

ConceptGraphs: Open-vocabulary 3D scene graphs for perception and planning

Q Gu, A Kuwajerwala, S Morin… - arXiv preprint arXiv …, 2023 - arxiv.org
For robots to perform a wide variety of tasks, they require a 3D representation of the world
that is semantically rich, yet compact and efficient for task-driven perception and planning …

Task and motion planning with large language models for object rearrangement

Y Ding, X Zhang, C Paxton… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning
is frequently needed in this process. However, achieving commonsense arrangements …