Attention based dual branches fingertip detection network and virtual key system

C Mou, X Zhang - Proceedings of the 28th ACM International Conference on Multimedia, 2020 - dl.acm.org
Gestures and fingertips are becoming increasingly important media for human-computer interaction (HCI), so algorithms for gesture recognition and fingertip detection have been investigated extensively. Two problems remain: balancing speed against accuracy, and coping with complex interaction environments. To address them, this paper proposes an attention-based dual-branch network that efficiently performs both fingertip detection and gesture recognition. To handle complex interaction environments, we combine channel-wise and spatial-wise attention in the fingertip detection model. Extensive experiments show that the model is both effective and efficient: the average fingertip detection error is about 2.8 pixels in a 640×480 video frame, the average recognition accuracy over eight gestures reaches 99%, and the average forward time is about 8 ms. Thanks to its lightweight design, the model also runs efficiently on a CPU. In addition, we design a virtual key system based on the proposed model that lets users perform the "clicking" operation naturally in a virtual environment. The system works well with a single ordinary RGB camera and needs no pre-processing (e.g., image segmentation or contour extraction), which significantly reduces the complexity of the interaction system.
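The abstract reports fingertip detection error in pixels but does not define the metric. A common choice for keypoint tasks is the mean Euclidean distance between predicted and annotated positions, and the sketch below assumes that definition. The function name `mean_fingertip_error` and the sample coordinates are hypothetical and do not come from the paper.

```python
import math


def mean_fingertip_error(predicted, ground_truth):
    """Mean Euclidean distance, in pixels, between paired (x, y) points.

    Assumed metric: the abstract does not state how its error is computed.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return sum(distances) / len(distances)


# Hypothetical predictions and labels inside a 640x480 frame
# (illustrative values only, not results from the paper).
predicted = [(320.0, 240.0), (103.0, 204.0)]
ground_truth = [(323.0, 244.0), (100.0, 200.0)]

error_px = mean_fingertip_error(predicted, ground_truth)
print(f"mean fingertip error: {error_px:.1f} px")
```

Because the error is measured in pixels, it depends on frame resolution. A figure such as 2.8 pixels is only comparable across experiments that use the same frame size, here 640×480.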