Authors
Vidit Kumar, Vikas Tripathi, Bhaskar Pant
Publication date
2021/8/20
Book
Machine Learning, Advances in Computing, Renewable Energy and Communication: Proceedings of MARC 2020
Pages
519-531
Publisher
Springer Singapore
Description
Websites such as YouTube, Facebook, and Twitter receive large numbers of videos every day, mostly uploaded from mobile devices, digital cameras, etc. These videos rarely have metadata (semantic tags) attached, which makes it very difficult to retrieve similar videos without content-based search techniques. Recently, two-dimensional convolutional networks (2D-CNNs) have shown breakthrough performance over hand-engineered methods on image-related tasks across the computer vision field. A video is composed of 2D frames arranged along the time dimension, so it can also be processed by a 2D-CNN. In this paper, we investigate the significance of the activations of CNN layers for video representation and analyze their performance on a nearest-neighbor search task, i.e., video retrieval. Three well-known CNN networks (AlexNet, GoogleNet and ResNet18) are exploited for …
Total citations
2021: 3, 2022: 9, 2023: 4, 2024: 1
Scholar articles
V Kumar, V Tripathi, B Pant - … Learning, Advances in Computing, Renewable Energy …, 2021