Generative AI for self-adaptive systems: State of the art and research roadmap

J Li, M Zhang, N Li, D Weyns, Z Jin, K Tei - ACM Transactions on …, 2024 - dl.acm.org
Self-adaptive systems (SASs) are designed to handle changes and uncertainties through a
feedback loop with four core functionalities: monitoring, analyzing, planning, and execution …

A survey of robot intelligence with large language models

H Jeong, H Lee, C Kim, S Shin - Applied Sciences, 2024 - mdpi.com
Since the emergence of ChatGPT, research on large language models (LLMs) has actively
progressed across various fields. LLMs, pre-trained on vast text datasets, have exhibited …

GPT-4V(ision) for robotics: Multimodal task planning from human demonstration

N Wake, A Kanehira, K Sasabuchi… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision),
to facilitate one-shot visual teaching for robotic manipulation. This system analyzes …

DiffusionDepth: Diffusion denoising approach for monocular depth estimation

Y Duan, X Guo, Z Zhu - European Conference on Computer Vision, 2024 - Springer
Monocular depth estimation is a challenging task that predicts the pixel-wise depth from a
single 2D image. Current methods typically model this problem as a regression or …

An interactive agent foundation model

Z Durante, B Sarkar, R Gong, R Taori, Y Noda… - arXiv preprint arXiv …, 2024 - arxiv.org
The development of artificial intelligence systems is transitioning from creating static,
task-specific models to dynamic, agent-based systems capable of performing well in a wide …

LLM-empowered state representation for reinforcement learning

B Wang, Y Qu, Y Jiang, J Shao, C Liu, W Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Conventional state representations in reinforcement learning often omit critical task-related
details, presenting a significant challenge for value networks in establishing accurate …

SuperPADL: Scaling language-directed physics-based control with progressive supervised distillation

J Juravsky, Y Guo, S Fidler, XB Peng - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Physically-simulated models for human motion can generate high-quality responsive
character animations, often in real-time. Natural language serves as a flexible interface for …

Generating Physically Realistic and Directable Human Motions from Multi-modal Inputs

A Shrestha, P Liu, G Ros, K Yuan, A Fern - European Conference on …, 2024 - Springer
This work focuses on generating realistic, physically-based human behaviors from
multi-modal inputs, which may only partially specify the desired motion. For example, the input …

Humanoid Locomotion and Manipulation: Current Progress and Challenges in Control, Planning, and Learning

Z Gu, J Li, W Shen, W Yu, Z Xie, S McCrory… - arXiv preprint arXiv …, 2025 - arxiv.org
Humanoid robots have great potential to perform various human-level skills. These skills
involve locomotion, manipulation, and cognitive capabilities. Driven by advances in machine …

Integrating reinforcement learning with foundation models for autonomous robotics: Methods and perspectives

A Moroncelli, V Soni, AA Shahid, M Maccarini… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation models (FMs), large deep learning models pre-trained on vast, unlabeled
datasets, exhibit powerful capabilities in understanding complex patterns and generating …