Core challenges in embodied vision-language planning

J Francis, N Kitamura, F Labelle, X Lu, I Navarro… - Journal of Artificial …, 2022 - jair.org
Recent advances in the areas of multimodal machine learning and artificial intelligence (AI)
have led to the development of challenging tasks at the intersection of Computer Vision …

RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation

Z Yang, J Liu, P Chen, A Cherian… - Proceedings of the …, 2024 - openaccess.thecvf.com
We leverage Large Language Models (LLMs) for zero-shot Semantic Audio-Visual
Navigation (SAVN). Existing methods utilize extensive training demonstrations for …

Multi-goal audio-visual navigation using sound direction map

H Kondoh, A Kanezaki - 2023 IEEE/RSJ International …, 2023 - ieeexplore.ieee.org
Over the past few years, there has been a great deal of research on navigation tasks in
indoor environments using deep reinforcement learning agents. Most of these tasks use only …

Learning semantic-agnostic and spatial-aware representation for generalizable visual-audio navigation

H Wang, Y Wang, F Zhong, M Wu… - IEEE Robotics and …, 2023 - ieeexplore.ieee.org
Visual-audio navigation (VAN) is attracting more and more attention from the robotic
community due to its broad applications, e.g., household robots and rescue robots. In this …

Human-Centric AI with Common Sense

F Ilievski - Springer
Like never before and well beyond the imagination of most of us, artificial intelligence (AI)
software is increasingly becoming commonplace. Seemingly overnight, AI models that …