TEVAD: Improved video anomaly detection with captions

W Chen, KT Ma, ZJ Yew, M Hur… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video surveillance systems are used to enhance public safety and protect private assets.
Automatic anomaly detection is vital in such surveillance systems to reduce human labor
and its associated costs. Previous works only consider spatial-temporal features. In many
complex real-world scenarios, such visual features are unable to capture the semantic
meanings required to further improve accuracy. To deal with such issues, we propose a
novel framework: Text Empowered Video Anomaly Detection (TEVAD) which utilizes both …

[PDF] TEVAD: Improved video anomaly detection with captions Supplementary Materials

W Chen, KT Ma, ZJ Yew, M Hur, DAA Khoo… - openaccess.thecvf.com
We extend the Multi-scale Temporal Network (MTN) [8] to process text features and learn the
long-range and short-range temporal dependencies between snippet text features. Pre-computed text
features $F_{txt}$ (i.e., sentence embeddings of the video snippets), where $F_{txt} \in \mathbb{R}^{d_{txt}}$, are fed into
each of the three pyramid dilated convolution (PDC) layers, given as $F(P_i) = f_{conv}(F_{txt}; \theta)$
for $i \in \{1, 2, 3\}$, where $F(P_i) \in \mathbb{R}^{d_{txt}/4}$ and $f_{conv}$ is a 1D convolution function. $\theta$ comprises
the weights for all convolution functions described in this section. The three feature vectors …
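
The pyramid dilated convolution step described in this snippet can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The snippet only states that there are three PDC branches, that each is a 1D convolution, and that each maps d_txt to d_txt/4 channels. The dilation rates (1, 2, 4), the kernel size of 3, the zero "same" padding, the random weights, and the names `dilated_conv1d` and `pyramid_dilated_conv` are all assumptions made for illustration.

```python
import random


def dilated_conv1d(x, weight, dilation):
    """Zero-padded ("same") 1D dilated convolution along the time axis.

    x:      list of T feature vectors, each of length d_in
    weight: weight[k][o][c] for kernel tap k, output channel o, input channel c
    """
    T, K = len(x), len(weight)
    d_out = len(weight[0])
    half = (K - 1) // 2
    out = []
    for t in range(T):
        y = [0.0] * d_out
        for k in range(K):
            s = t + (k - half) * dilation  # time index read by this kernel tap
            if 0 <= s < T:  # positions outside the sequence count as zeros
                for o in range(d_out):
                    y[o] += sum(w * v for w, v in zip(weight[k][o], x[s]))
        out.append(y)
    return out


def pyramid_dilated_conv(f_txt, dilations=(1, 2, 4), kernel_size=3, seed=0):
    """Apply three dilated 1D convolutions to the same text features.

    Each branch maps d_txt channels to d_txt // 4 channels, as in the snippet.
    The dilation rates, kernel size, and random weights are assumptions.
    """
    rng = random.Random(seed)
    d_txt = len(f_txt[0])
    d_out = d_txt // 4
    branches = []
    for dilation in dilations:
        weight = [[[rng.uniform(-0.1, 0.1) for _ in range(d_txt)]
                   for _ in range(d_out)]
                  for _ in range(kernel_size)]
        branches.append(dilated_conv1d(f_txt, weight, dilation))
    return branches


# Toy input: 8 snippets, each with a 16-dimensional text embedding.
f_txt = [[float(t + c) for c in range(16)] for t in range(8)]
branches = pyramid_dilated_conv(f_txt)
print(len(branches), len(branches[0]), len(branches[0][0]))
```

Larger dilation rates let a branch read snippets that are further apart in time without adding parameters, which is one way to cover both short-range and long-range dependencies.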