NTIRE 2024 challenge on short-form UGC video quality assessment: Methods and results

X Li, K Yuan, Y Pei, Y Lu, M Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality
Assessment (S-UGC VQA) where various excellent solutions are submitted and evaluated …

Recommender systems leveraging multimedia content

Y Deldjoo, M Schedl, P Cremonesi, G Pasi - ACM Computing Surveys …, 2020 - dl.acm.org
Recommender systems have become a popular and effective means to manage the ever-
increasing amount of multimedia content available today and to help users discover …

Hierarchical text-conditional image generation with CLIP latents

A Ramesh, P Dhariwal, A Nichol, C Chu… - arXiv preprint arXiv …, 2022 - 3dvar.com
Contrastive models like CLIP have been shown to learn robust representations of images
that capture both semantics and style. To leverage these representations for image …

MaxViT: Multi-axis vision transformer

Z Tu, H Talebi, H Zhang, F Yang, P Milanfar… - European conference on …, 2022 - Springer
Transformers have recently gained significant attention in the computer vision community.
However, the lack of scalability of self-attention mechanisms with respect to image size has …

Sequential modeling enables scalable learning for large vision models

Y Bai, X Geng, K Mangalam, A Bar… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …

Exploring CLIP for assessing the look and feel of images

J Wang, KCK Chan, CC Loy - Proceedings of the AAAI Conference on …, 2023 - ojs.aaai.org
Measuring the perception of visual content is a long-standing problem in computer vision.
Many mathematical models have been developed to evaluate the look or quality of an …

Optimizing prompts for text-to-image generation

Y Hao, Z Chi, L Dong, F Wei - Advances in Neural …, 2024 - proceedings.neurips.cc
Well-designed prompts can guide text-to-image models to generate amazing images.
However, the performant prompts are often model-specific and misaligned with user input …

Imagen Editor and EditBench: Advancing and evaluating text-guided image inpainting

S Wang, C Saharia, C Montgomery… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided image editing can have a transformative impact in supporting creative
applications. A key challenge is to generate edits that are faithful to the input text prompt …

MUSIQ: Multi-scale image quality transformer

J Ke, Q Wang, Y Wang, P Milanfar… - Proceedings of the …, 2021 - openaccess.thecvf.com
Image quality assessment (IQA) is an important research topic for understanding and
improving visual experience. The current state-of-the-art IQA methods are based on …

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2024 - proceedings.neurips.cc
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …