Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

P Jin, R Takanobu, W Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

P Jin, R Takanobu, C Zhang, X Cao, L Yuan - arXiv e-prints, 2023 - ui.adsabs.harvard.edu
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

P Jin, R Takanobu, C Zhang, X Cao, L Yuan - academia.edu
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

P Jin, R Takanobu, C Zhang, X Cao, L Yuan - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …