Authors
Anish Kumar Vishwakarma, Kishor M Bhurchandi
Publication date
2022/11/4
Book
International Conference on Computer Vision and Image Processing
Pages
638-645
Publisher
Springer Nature Switzerland
Description
This work proposes a reliable and efficient end-to-end No-Reference Video Quality Assessment (NR-VQA) model that fuses deep spatial and temporal features. Since both spatial (semantic) and temporal (motion) features significantly affect video quality, we combine the two to build an effective and fast video quality predictor. ResNet-50, a well-known pre-trained image classification model, is employed to extract semantic features from video frames, whereas I3D, a well-known pre-trained action recognition model, is used to compute spatiotemporal features from short video clips. The extracted features are then passed through a regressor head consisting of a Gated Recurrent Unit (GRU) followed by a Fully Connected (FC) layer. Four popular and widely used authentic-distortion databases, LIVE-VQC, KoNViD-1k, LIVE-Qualcomm, and CVD2014, are utilized for validating the performance …
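The regressor head described above (a GRU over per-clip feature sequences, followed by an FC layer producing a single quality score) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensions (2048 for ResNet-50 pooled features, 1024 for I3D), the hidden size, and the class name are all assumptions for demonstration.

```python
import torch
import torch.nn as nn

class VQARegressorHead(nn.Module):
    """Hypothetical sketch of a GRU + FC regressor head for NR-VQA.

    Maps a sequence of fused per-clip features (ResNet-50 semantic
    features concatenated with I3D spatiotemporal features) to one
    predicted quality score per video. Dimensions are assumed, not
    taken from the paper.
    """

    def __init__(self, feat_dim=2048 + 1024, hidden_dim=128):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        # x: (batch, num_clips, feat_dim) fused feature sequences
        _, h = self.gru(x)              # h: (1, batch, hidden_dim), last hidden state
        return self.fc(h.squeeze(0))    # (batch, 1) predicted quality scores

head = VQARegressorHead()
feats = torch.randn(4, 8, 2048 + 1024)  # 4 videos, 8 clips each
scores = head(feats)
print(scores.shape)  # torch.Size([4, 1])
```

The GRU summarizes temporal variation across clips into its final hidden state, and the FC layer regresses that summary to a scalar score; during training one would typically fit this head with an L1 or L2 loss against subjective mean opinion scores.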
Scholar articles
AK Vishwakarma, KM Bhurchandi - International Conference on Computer Vision and …, 2022