Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org
Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Large language models for human-robot interaction: A review

C Zhang, J Chen, J Li, Y Peng, Z Mao - Biomimetic Intelligence and …, 2023 - Elsevier
The fusion of large language models and robotic systems has introduced a transformative
paradigm in human–robot interaction, offering unparalleled capabilities in natural language …

LLM+P: Empowering large language models with optimal planning proficiency

B Liu, Y Jiang, X Zhang, Q Liu, S Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable zero-shot generalization
abilities: state-of-the-art chatbots can provide plausible answers to many common questions …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Cognitive architectures for language agents

TR Sumers, S Yao, K Narasimhan… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent efforts have incorporated large language models (LLMs) with external resources (e.g.,
the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …

Toward general-purpose robots via foundation models: A survey and meta-analysis

Y Hu, Q Xie, V Jain, J Francis, J Patrikar… - arXiv preprint arXiv …, 2023 - arxiv.org
Building general-purpose robots that operate seamlessly in any environment, with any
object, and utilizing various skills to complete diverse tasks has been a long-standing goal in …

Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis
Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Look before you leap: Unveiling the power of GPT-4V in robotic vision-language planning

Y Hu, F Lin, T Zhang, L Yi, Y Gao - arXiv preprint arXiv:2311.17842, 2023 - arxiv.org
In this study, we are interested in imbuing robots with the capability of physically-grounded
task planning. Recent advancements have shown that large language models (LLMs) …

Distilling and retrieving generalizable knowledge for robot manipulation via language corrections

L Zha, Y Cui, LH Lin, M Kwon… - … on Robotics and …, 2024 - ieeexplore.ieee.org
Today's robot policies exhibit subpar performance when faced with the challenge of
generalizing to novel environments. Human corrective feedback is a crucial form of …

Semantic anomaly detection with large language models

A Elhafsi, R Sinha, C Agia, E Schmerling… - Autonomous …, 2023 - Springer
As robots acquire increasingly sophisticated skills and see increasingly complex and varied
environments, the threat of an edge case or anomalous failure is ever present. For example …