This survey presents an in-depth exploration of knowledge distillation (KD) techniques within the realm of Large Language Models (LLMs), spotlighting the pivotal role of KD in …
We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), by integrating observations of human actions to facilitate robotic manipulation. This …
Multilingual Large Language Models are capable of using powerful Large Language Models to handle and respond to queries in multiple languages, which achieves remarkable …
L Zhong, Z Wang, J Shang - arXiv preprint arXiv:2402.16906, 2024 - arxiv.org
Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into …
L Qin, Q Chen, X Feng, Y Wu, Y Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) like ChatGPT have shown impressive capabilities in Natural Language Processing (NLP) tasks, a systematic investigation of their potential in this …
Robotic behavior synthesis, the problem of understanding multimodal inputs and generating precise physical control for robots, is an important part of Embodied AI. Despite successes in …
Multi-modal Chain-of-Thought (MCoT) requires models to leverage knowledge from both textual and visual modalities for step-by-step reasoning, which gains increasing attention …
Y Zeng, Y Mu, L Shao - arXiv preprint arXiv:2405.07162, 2024 - arxiv.org
Learning reward functions remains the bottleneck to equip a robot with a broad repertoire of skills. Large Language Models (LLM) contain valuable task-related knowledge that can …
Y Zhang, S Yang, C Bai, F Wu, X Li, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Grounding the reasoning ability of large language models (LLMs) for embodied tasks is challenging due to the complexity of the physical world. Especially, LLM planning for multi …