D Zhang, X Dai, X Wang, YF Wang… - Proceedings of the …, 2019 - openaccess.thecvf.com
This research strives for natural language moment retrieval in long, untrimmed video streams. The problem is not trivial, especially when a video contains multiple moments of …
Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos. The temporal relation is complex in those datasets, including …
R Dai, S Das, F Bremond - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
In video understanding, most cross-modal knowledge distillation (KD) methods are tailored for classification tasks, focusing on the discriminative representation of the trimmed videos …
Handling long and complex temporal information is an important factor for action detection tasks. This challenge is further aggravated by densely distributed actions in untrimmed …
We present PAT, a transformer-based network that learns complex temporal co-occurrence action dependencies in a video by exploiting multi-scale temporal features. In existing …
Designing activity detection systems that can be successfully deployed in daily-living environments requires datasets that pose the challenges typical of real-world scenarios. In …
Y Ming, F Feng, C Li, JH Xue - Neurocomputing, 2021 - Elsevier
Video action recognition is a vital area of computer vision. By adding a temporal dimension to the convolution structure, a 3D convolutional neural network gains the capacity to extract spatio …
D Zhang, X Dai, YF Wang - Computer Vision–ACCV 2018: 14th Asian …, 2019 - Springer
Recognizing instances at varying scales simultaneously is a fundamental challenge in visual detection problems. While spatial multi-scale modeling has been well studied in …
Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos. There are many real-world challenges in those datasets, such …