Automatic anomaly detection is vital in such surveillance systems to reduce human labor
and its associated costs. Previous works consider only spatial-temporal features. In many
complex real-world scenarios, such visual features are unable to capture the semantic
meanings required to further improve accuracy. To address this limitation, we propose a
novel framework, Text Empowered Video Anomaly Detection (TEVAD), which utilizes both …