GR-2: A generative video-language-action model with web-scale knowledge for robot manipulation

CL Cheang, G Chen, Y Jing, T Kong, H Li, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
We present GR-2, a state-of-the-art generalist robot agent for versatile and generalizable
robot manipulation. GR-2 is first pre-trained on a vast number of Internet videos to capture …

D³Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Robotic Manipulation

Y Wang, M Zhang, Z Li… - ICRA 2024 Workshop …, 2023 - openreview.net
Scene representation has been a crucial design choice in robotic manipulation systems. An
ideal representation should be 3D, dynamic, and semantic to meet the demands of diverse …

Deep generative models in robotics: A survey on learning from multimodal demonstrations

J Urain, A Mandlekar, Y Du, M Shafiullah, D Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning from Demonstrations, the field that proposes to learn robot behavior models from
data, is gaining popularity with the emergence of deep generative models. Although the …

3D diffusion policy

Y Ze, G Zhang, K Zhang, C Hu, M Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Imitation learning provides an efficient way to teach robots dexterous skills; however,
learning complex skills robustly and generalizably usually consumes large amounts of …

Continuous control with coarse-to-fine reinforcement learning

Y Seo, J Uruç, S James - arXiv preprint arXiv:2407.07787, 2024 - arxiv.org
Despite recent advances in improving the sample-efficiency of reinforcement learning (RL)
algorithms, designing an RL algorithm that can be practically deployed in real-world …

Generalizable humanoid manipulation with improved 3D diffusion policies

Y Ze, Z Chen, W Wang, T Chen, X He, Y Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Humanoid robots capable of autonomous operation in diverse environments have long
been a goal for roboticists. However, autonomous manipulation by humanoid robots has …

RoboUniView: Visual-language model with unified view representation for robotic manipulation

F Liu, F Yan, L Zheng, C Feng, Y Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Utilizing Vision-Language Models (VLMs) for robotic manipulation represents a novel
paradigm, aiming to enhance the model's ability to generalize to new objects and …

D³Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement

Y Wang, M Zhang, Z Li, T Kelestemur… - … Conference on Robot …, 2024 - openreview.net
Scene representation is a crucial design choice in robotic manipulation systems. An ideal
representation is expected to be 3D, dynamic, and semantic to meet the demands of diverse …

Closed-loop visuomotor control with generative expectation for robotic manipulation

Q Bu, J Zeng, L Chen, Y Yang, G Zhou, J Yan… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant progress in robotics and embodied AI in recent years, deploying robots
for long-horizon tasks remains a great challenge. The majority of prior works adhere to an open …

The art of imitation: Learning long-horizon manipulation tasks from few demonstrations

JO von Hartz, T Welschehold, A Valada… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for
learning object-centric robot manipulation tasks. However, there are several open …