We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …
Dynamic neural network is an emerging research topic in deep learning. Compared to static models which have fixed computational graphs and parameters at the inference stage …
Y Wang, Y Sun, Y Huang, Z Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Current benchmarks for facial expression recognition (FER) mainly focus on static images, while there are limited datasets for FER in videos. It is still ambiguous to evaluate whether …
Empowered by Large Language Models (LLMs), recent advancements in Video-based LLMs (VideoLLMs) have driven progress in various video understanding tasks. These …
Adaptive optimization methods for deep learning adjust the inference task to the current circumstances at runtime to improve the resource footprint while maintaining the model's …
Z Fei, X Yan, S Wang, Q Tian - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Both accuracy and efficiency are crucial for image captioning in real-world scenarios. Although Transformer-based models have gained significant improved captioning …
H Guan, J Lin, RWH Lau - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Mirrors generally lack a consistent visual appearance, making mirror detection very challenging. Although recent works that are based on exploiting contextual contrasts and …
Recent works have shown that the computational efficiency of video recognition can be significantly improved by reducing the spatial redundancy. As a representative work, the …
Recent advancements in deploying deep neural networks (DNNs) on resource-constrained devices have generated interest in input-adaptive dynamic neural networks (DyNNs) …