Video generation has witnessed significant advancements yet evaluating these models remains a challenge. A comprehensive evaluation benchmark for video generation is …
H Chen, Y Zhang, X Cun, M Xia… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-video generation aims to produce a video based on a given prompt. Recently several commercial video models have been able to generate plausible videos with minimal …
This paper reviews the AIS 2024 Video Quality Assessment (VQA) Challenge focused on User-Generated Content (UGC). The aim of this challenge is to gather deep learning-based …
Y Liu, X Cun, X Liu, X Wang, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
The vision and language generative models have been overgrown in recent years. For video generation various open-sourced models and public-available services have been …
Multi-modality large language models (MLLMs) as represented by GPT-4V have introduced a paradigm shift for visual perception and understanding tasks that a variety of abilities can …
The rapid evolution of Multi-modality Large Language Models (MLLMs) has catalyzed a shift in computer vision from specialized models to general-purpose foundation models …
Text-based diffusion models have exhibited remarkable success in generation and editing showing great promise for enhancing visual content with their generative prior. However …
The explosion of visual content available online underscores the requirement for an accurate machine assessor to robustly evaluate scores across diverse types of visual …
S Aneja, J Thies, A Dai… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
We introduce FaceTalk a novel generative approach designed for synthesizing high-fidelity 3D motion sequences of talking human heads from input audio signal. To capture the …