A holistic understanding of object properties across diverse sensory modalities (eg, visual, audio, and haptic) is essential for tasks ranging from object categorization to complex …
Humans learn about objects via interaction and using multiple perceptions, such as vision, sound, and touch. While vision can provide information about an object's appearance, non …
There has been a lot of interest in grounding natural language to physical entities through visual context. While Vision Language Models (VLMs) can ground linguistic instructions to …
We address the task of long-horizon navigation in partially mapped environments for which active gathering of information about faraway unseen space is essential for good behavior …
For modern robots that are equipped with a set of skills, such as manipulation and navigation, it's crucial that they can autonomously make decisions to complete tasks over …