[PDF] Learning attentional mechanisms for simultaneous object tracking and recognition with deep networks

L Bazzani, N de Freitas, JA Ting - NIPS 2010 Deep Learning and …, 2010 - cs.ubc.ca
NIPS 2010 Deep Learning and Unsupervised Feature Learning Workshop, 2010 - cs.ubc.ca
Abstract
We propose a novel attentional model for simultaneous object tracking and recognition that is driven by gaze data. Motivated by theories of the human perceptual system, the model consists of two interacting pathways: ventral and dorsal. The ventral pathway models object appearance and classification using deep (factored) restricted Boltzmann machines. At each point in time, the observations consist of retinal images, with resolution decaying toward the periphery of the gaze. The dorsal pathway models the location, orientation, scale and speed of the attended object. The posterior distribution of these states is estimated with particle filtering. Deeper in the dorsal pathway, we encounter an attentional mechanism that learns to control gazes so as to maximize different objectives. Here we demonstrate the method when the objective is to minimize the uncertainty in the posterior distribution of the states. The approach is modular (with each module easily replaceable with more sophisticated algorithms), straightforward to implement, practically efficient, and works well in simple video sequences.
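To make the particle filtering step mentioned in the abstract concrete, the following is a minimal sketch of a generic bootstrap particle filter. It is not the authors' implementation. It assumes a one-dimensional state of position and speed, a constant-velocity motion model, and a Gaussian observation likelihood; all function names and noise parameters are hypothetical. The weighted posterior variance it reports is only one possible measure of the uncertainty that a gaze policy might try to reduce.

```python
import math
import random


def predict(particles, rng, pos_noise=0.5, vel_noise=0.1):
    """Propagate (position, speed) particles with an assumed constant-velocity model."""
    return [(x + v + rng.gauss(0.0, pos_noise), v + rng.gauss(0.0, vel_noise))
            for x, v in particles]


def update(particles, observation, obs_noise=1.0):
    """Return normalized weights from an assumed Gaussian likelihood of the observed position."""
    weights = [math.exp(-0.5 * ((observation - x) / obs_noise) ** 2)
               for x, _ in particles]
    total = sum(weights)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def posterior_mean_and_variance(particles, weights):
    """Weighted mean and variance of the position component."""
    mean = sum(w * x for (x, _), w in zip(particles, weights))
    var = sum(w * (x - mean) ** 2 for (x, _), w in zip(particles, weights))
    return mean, var


def track(observations, n_particles=500, seed=0):
    """Run the filter and return one (mean, variance) pair per observation."""
    rng = random.Random(seed)
    # Broad initial belief over position and speed (an assumption of this sketch).
    particles = [(rng.gauss(0.0, 5.0), rng.gauss(0.0, 1.0))
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = predict(particles, rng)
        weights = update(particles, z)
        estimates.append(posterior_mean_and_variance(particles, weights))
        particles = resample(particles, weights, rng)
    return estimates


if __name__ == "__main__":
    # Synthetic target moving one unit per frame.
    for mean, var in track([float(t) for t in range(1, 21)]):
        print(f"position estimate {mean:.2f}, posterior variance {var:.3f}")
```

In the model the abstract describes, the state would also include orientation and scale, and the likelihood would come from the appearance model rather than from a direct position measurement; the predict, weight, resample loop is the part this sketch is meant to illustrate.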