Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor control and instruction comprehension through end-to-end learning processes. However …
Learned language-conditioned robot policies often struggle to effectively adapt to new real- world tasks even when pre-trained across a diverse set of instructions. We propose a novel …
Reinforcement learning (RL) is one of the active fields in machine learning, demonstrating remarkable potential in tackling real-world challenges. Despite its promising prospects, this …
Robots can use Visual Imitation Learning (VIL) to learn everyday tasks from video demonstrations. However, translating visual observations into actionable robot policies is …
RK Jenamani, P Sundaresan, M Sakr… - arXiv preprint arXiv …, 2024 - arxiv.org
Robot-assisted feeding has the potential to improve the quality of life for individuals with mobility limitations who are unable to feed themselves independently. However, there exists …
This paper presents Words2Contact, a language-guided multi-contact placement pipeline leveraging large language models and vision language models. Our method is a key …
In recent years roboticists have achieved remarkable progress in solving increasingly general tasks on dexterous robotic hardware by leveraging high capacity Transformer …
The ability to reflect on and correct failures is crucial for robotic systems to interact stably with real-life objects. Observing the generalization and reasoning capabilities of Multimodal …
A Lee, I Chuang, LY Chen, I Soltani - arXiv preprint arXiv:2409.07914, 2024 - arxiv.org
Bimanual manipulation presents unique challenges compared to unimanual tasks due to the complexity of coordinating two robotic arms. In this paper, we introduce InterACT: Inter …