Weakly supervised audio-visual violence detection

P Wu, X Liu, J Liu - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org

Violence detection in videos holds great promise for practical applications, given the massive volume of video produced in recent years. Most previous works define violence detection as a simple video classification task and use a single modality, e.g., the visual signal, from small-scale datasets. However, such solutions are insufficient. To mitigate this problem, we study weakly supervised violence detection on large-scale audio-visual violence data and first introduce two complementary tasks, i.e., coarse-grained violent frame detection and fine-grained violent event detection. These tasks advance simple violence video classification to frame-level violent event localization, which aims to accurately locate violent events in untrimmed videos. We then propose a novel network that takes audio-visual data as input and contains three parallel branches that capture different relationships among video snippets and further integrate features: the similarity branch and the proximity branch capture long-range dependencies using a similarity prior and a proximity prior, respectively, and the score branch dynamically captures the closeness of predicted scores. In both the coarse-grained and fine-grained tasks, our approach outperforms other state-of-the-art approaches on two public datasets. Moreover, experimental results also show the positive effect of audio-visual input and relationship modeling.
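The three relationship types named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the cosine-similarity threshold, the exponential distance decay, the `1 - |s_i - s_j|` score weighting, and every function and parameter name below are assumptions chosen only for illustration. In the paper the branches are part of a trained network; here each one is reduced to a fixed weight matrix over snippets.

```python
import math


def row_normalize(matrix):
    """Scale each row to sum to 1; rows that sum to 0 stay all zeros."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def similarity_adjacency(features, threshold=0.5):
    """Assumed similarity prior: thresholded cosine similarity of snippet features."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na > 0 and nb > 0 else 0.0

    n = len(features)
    adj = [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]
    # Drop weak links so only sufficiently similar snippets exchange information.
    adj = [[v if v > threshold else 0.0 for v in row] for row in adj]
    return row_normalize(adj)


def proximity_adjacency(n, gamma=1.0):
    """Assumed proximity prior: weight decays with temporal distance |i - j|."""
    adj = [[math.exp(-abs(i - j) / gamma) for j in range(n)] for i in range(n)]
    return row_normalize(adj)


def score_adjacency(scores):
    """Assumed score closeness: snippets with similar scores in [0, 1] weigh more."""
    n = len(scores)
    adj = [[1.0 - abs(scores[i] - scores[j]) for j in range(n)] for i in range(n)]
    return row_normalize(adj)


def aggregate(adj, features):
    """One propagation step: each snippet becomes a weighted mean of all snippets."""
    n, dim = len(features), len(features[0])
    return [
        [sum(adj[i][k] * features[k][d] for k in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Toy input: four snippets with 2-D features and hypothetical violence scores.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
scores = [0.9, 0.8, 0.1, 0.2]

branch_outputs = [
    aggregate(similarity_adjacency(features), features),
    aggregate(proximity_adjacency(len(features)), features),
    aggregate(score_adjacency(scores), features),
]
```

The similarity and proximity matrices depend only on the input features and snippet order, while the score matrix changes whenever the predicted scores change, which is one way to read "dynamically" in the abstract. How the three branch outputs are fused is not described in the abstract and is left open here.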

Cited by: 47 (2 versions)
