Open x-embodiment: Robotic learning datasets and rt-x models

J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs) have undergone significant expansion and have been
increasingly integrated across various domains. Notably, in the realm of robot task planning …

被引用次数：16 相关文章所有 3 个版本

[PDF] thecvf.com

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action

J Lu, C Clark, S Lee, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present Unified-IO 2 a multimodal and multi-skill unified model capable of following
novel instructions. Unified-IO 2 can use text images audio and/or videos as input and can …

被引用次数：37 相关文章所有 3 个版本

[PDF] arxiv.org

Mobile aloha: Learning bimanual mobile manipulation with low-cost whole-body teleoperation

Z Fu, TZ Zhao, C Finn - arXiv preprint arXiv:2401.02117, 2024 - arxiv.org

Imitation learning from human demonstrations has shown impressive performance in
robotics. However, most results focus on table-top manipulation, lacking the mobility and …

被引用次数：78 相关文章所有 3 个版本

[PDF] arxiv.org

Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …

被引用次数：11 相关文章所有 2 个版本

[PDF] arxiv.org

Octo: An open-source generalist robot policy

OM Team, D Ghosh, H Walke, K Pertsch… - arXiv preprint arXiv …, 2024 - arxiv.org

Large policies pretrained on diverse robot datasets have the potential to transform robotic
learning: instead of training new policies from scratch, such generalist robot policies may be …

被引用次数：40 相关文章所有 3 个版本

[PDF] arxiv.org

The foundation model transparency index

R Bommasani, K Klyman, S Longpre, S Kapoor… - arXiv preprint arXiv …, 2023 - arxiv.org

Foundation models have rapidly permeated society, catalyzing a wave of generative AI
applications spanning enterprise and consumer-facing contexts. While the societal impact of …

被引用次数：42 相关文章所有 2 个版本

[PDF] arxiv.org

Zero-shot robotic manipulation with pretrained image-editing diffusion models

K Black, M Nakamoto, P Atreya, H Walke… - arXiv preprint arXiv …, 2023 - arxiv.org

If generalist robots are to operate in truly unstructured environments, they need to be able to
recognize and reason about novel objects and scenarios. Such objects and scenarios might …

被引用次数：32 相关文章所有 4 个版本

[PDF] arxiv.org

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian, A Majumdar, J Sun… - arXiv preprint arXiv …, 2023 - arxiv.org

We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

被引用次数：42 相关文章所有 2 个版本

[PDF] arxiv.org

Vision-language foundation models as effective robot imitators

X Li, M Liu, H Zhang, C Yu, J Xu, H Wu… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent progress in vision language foundation models has shown their ability to understand
multimodal data and resolve complicated vision language tasks, including robotics …

被引用次数：29 相关文章所有 3 个版本

[PDF] arxiv.org

On bringing robots home

NMM Shafiullah, A Rai, H Etukuru, Y Liu, I Misra… - arXiv preprint arXiv …, 2023 - arxiv.org

Throughout history, we have successfully integrated various machines into our homes.
Dishwashers, laundry machines, stand mixers, and robot vacuums are a few recent …

被引用次数：20 相关文章

高级搜索

QQ 群