Unfolding the literature: A review of robotic cloth manipulation

A Longhini, Y Wang, I Garcia-Camacho… - Annual Review of …, 2024 - annualreviews.org
The realm of textiles spans clothing, households, healthcare, sports, and industrial
applications. The deformable nature of these objects poses unique challenges that prior …

Data scaling laws in imitation learning for robotic manipulation

F Lin, Y Hu, P Sheng, C Wen, J You, Y Gao - arXiv preprint arXiv …, 2024 - arxiv.org
Data scaling has revolutionized fields like natural language processing and computer vision,
providing models with remarkable generalization capabilities. In this paper, we investigate …

Learning to manipulate anywhere: A visual generalizable framework for reinforcement learning

Z Yuan, T Wei, S Cheng, G Zhang, Y Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Can we endow visuomotor robots with generalization capabilities to operate in diverse open-
world scenarios? In this paper, we propose Maniwhere, a generalizable framework …

Green screen augmentation enables scene generalisation in robotic manipulation

E Teoh, S Patidar, X Ma, S James - arXiv preprint arXiv:2407.07868, 2024 - arxiv.org
Generalising vision-based manipulation policies to novel environments remains a
challenging area with limited exploration. Current practices involve collecting data in one …

Aha: A vision-language-model for detecting and reasoning over failures in robotic manipulation

J Duan, W Pumacay, N Kumar, YR Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Robotic manipulation in open-world settings requires not only task execution but also the
ability to detect and learn from failures. While recent advances in vision-language models …

Generative image as action models

M Shridhar, YL Lo, S James - arXiv preprint arXiv:2407.07875, 2024 - arxiv.org
Image-generation diffusion models have been fine-tuned to unlock new capabilities such as
image-editing and novel view synthesis. Can we similarly unlock image-generation models …

3D-MVP: 3D multiview pretraining for robotic manipulation

S Qian, K Mo, V Blukis, DF Fouhey, D Fox… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent works have shown that visual pretraining on egocentric datasets using masked
autoencoders (MAE) can improve generalization for downstream robotics tasks. However …

Recasting generic pretrained vision transformers as object-centric scene encoders for manipulation policies

J Qian, A Panagopoulos, D Jayaraman - arXiv preprint arXiv:2405.15916, 2024 - arxiv.org
Generic re-usable pre-trained image representation encoders have become a standard
component of methods for many computer vision tasks. As visual representations for robots …

Position: Scaling simulation is neither necessary nor sufficient for in-the-wild robot manipulation

H Bharadhwaj - Forty-first International Conference on Machine …, 2024 - openreview.net
In this paper, we develop a structured critique of robotic simulations for real-world
manipulation, by arguing that scaling simulators is neither necessary nor sufficient for …

Investigating the role of instruction variety and task difficulty in robotic manipulation tasks

A Parekh, N Vitsakis, A Suglia, I Konstas - arXiv preprint arXiv:2407.03967, 2024 - arxiv.org
Evaluating the generalisation capabilities of multimodal models based solely on their
performance on out-of-distribution data fails to capture their true robustness. This work …