Authors
Natalia Neverova, Christian Wolf, Graham W Taylor, Florian Nebout
Publication date
2015
Conference
Computer Vision-ECCV 2014 Workshops: Zurich, Switzerland, September 6-7 and 12, 2014, Proceedings, Part I 13
Pages
474-490
Publisher
Springer International Publishing
Description
We present a method for gesture detection and localization based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at two temporal scales. Key to our technique is a training strategy which exploits i) careful initialization of individual modalities; and ii) gradual fusion of modalities from strongest to weakest cross-modality structure. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams.
Total citations
[Chart: citations per year, 2014-2024]
Scholar articles
N Neverova, C Wolf, GW Taylor, F Nebout - Computer Vision-ECCV 2014 Workshops: Zurich …, 2015