LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

X Li, C Mata, J Park, K Kahatapitiya, YS Jang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) equipped with extensive world knowledge and strong
reasoning skills can tackle diverse tasks across domains, often by posing them as …

Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations

P Li, T Liu, Y Li, M Han, H Geng, S Wang, Y Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous robotic systems capable of learning novel manipulation tasks are poised to
transform industries from manufacturing to service automation. However, modern methods …

Towards Generalist Robot Learning from Internet Video: A Survey

R McCarthy, DCH Tan, D Schmidt, F Acero… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey presents an overview of methods for learning from video (LfV) in the context of
reinforcement learning (RL) and robotics. We focus on methods capable of scaling to large …

Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control

Z Xiong, R Vuorio, J Beck, M Zimmer, K Shao… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning a universal policy across different robot morphologies can significantly improve
learning efficiency and enable zero-shot generalization to unseen morphologies. However …

A systematic review of major evaluation metrics for simulator-based automatic assessment of driving after stroke

P Taveekitworachai, G Chanmas, P Paliyawan… - Heliyon, 2024 - cell.com
Background: Simulator-based driving assessments (SA) have recently been used and
studied for various purposes, particularly for post-stroke patients. Automating such …

What Foundation Models can Bring for Robot Learning in Manipulation: A Survey

D Li, Y Jin, H Yu, J Shi, X Hao, P Hao, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The realization of universal robots is an ultimate goal of researchers. However, a key hurdle
in achieving this goal lies in the robots' ability to manipulate objects in their unstructured …

Understanding Long Videos in One Multimodal Language Model Pass

K Ranasinghe, X Li, K Kahatapitiya… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs), known to possess strong world knowledge,
have allowed recent approaches to achieve excellent performance on Long-Video …

PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models

D Guo, Y Xiang, S Zhao, X Zhu, M Tomizuka… - arXiv preprint arXiv …, 2024 - arxiv.org
Robotic grasping is a fundamental aspect of robot functionality, defining how robots interact
with objects. Despite substantial progress, its generalizability to counter-intuitive or long …

A Survey of Robotic Language Grounding: Tradeoffs Between Symbols and Embeddings

V Cohen, JX Liu, R Mooney, S Tellex… - arXiv preprint arXiv …, 2024 - arxiv.org
With large language models, robots can understand language more flexibly and more
capably than ever before. This survey reviews recent literature and situates it into a …

Ego-Foresight: Agent Visuomotor Prediction as Regularization for RL

MS Nunes, A Dehban, Y Demiris… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the significant advancements in Deep Reinforcement Learning (RL) observed in the
last decade, the amount of training experience necessary to learn effective policies remains …