Representing robotic manipulation tasks as constraints that associate the robot and the environment is a promising way to encode desired robot behaviors. However, it remains …
To endow models with greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes. This can be formalized …
We seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation—interacting with unseen objects in novel scenes without test-time adaptation …
We present Im2Flow2Act, a scalable learning framework that enables robots to acquire real-world manipulation skills without the need of real-world robot training data. The key idea …
H Bharadhwaj, R Mottaghi, A Gupta… - European Conference on …, 2025 - Springer
We seek to learn a generalizable goal-conditioned policy that enables diverse robot manipulation—interacting with unseen objects in novel scenes without test-time adaptation …
How can robot manipulation policies generalize to novel tasks involving unseen object types and new motions? In this paper, we provide a solution in terms of predicting motion …
We introduce Latent Action Pretraining for general Action models (LAPA), an unsupervised method for pretraining Vision-Language-Action (VLA) models without ground-truth robot …
Humans can learn to manipulate new objects by simply watching others; providing robots with the ability to learn from such demonstrations would enable a natural interface specifying …
Both text and video data are abundant on the internet and support large-scale self-supervised learning through next token or frame prediction. However, they have not been …