Online multi-modal task-driven dictionary learning and robust joint sparse representation for visual tracking

A Taalimi, H Qi, R Khorsandi - 2015 12th IEEE International …, 2015 - ieeexplore.ieee.org
Robust visual tracking is a challenging problem due to pose variance, occlusion and cluttered backgrounds. No single feature can be robust to all possible scenarios in a video sequence. However, exploiting multiple features has demonstrated its effectiveness in overcoming challenging situations in visual tracking. We propose a new framework for multi-modal fusion at both the feature level and decision level by training a reconstructive and discriminative dictionary and classifier for each modality simultaneously with the additional constraint of label consistency across different modalities. In addition, a joint decision measure is designed based on both reconstruction and classification error to adaptively adjust the weights of different features such that unreliable features can be removed from tracking. The proposed tracking scheme is referred to as the label-consistent and fusion-based joint sparse coding (LC-FJSC). Extensive experiments on publicly available videos demonstrate that LC-FJSC outperforms state-of-the-art trackers.
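The adaptive weighting described above can be illustrated with a minimal sketch. The function below is a hypothetical interpretation, not the paper's actual formulation: it combines per-modality reconstruction and classification errors into fusion weights and zeroes out modalities whose weight falls below a threshold, mirroring the idea that unreliable features are removed from tracking. The names `modality_weights`, `alpha`, and `drop_thresh` are assumptions for illustration.

```python
import numpy as np

def modality_weights(recon_err, cls_err, alpha=1.0, drop_thresh=0.05):
    """Sketch of a joint decision measure (hypothetical, not the paper's exact
    formula): lower combined reconstruction + classification error yields a
    higher fusion weight; near-zero weights are dropped as unreliable."""
    err = np.asarray(recon_err, dtype=float) + np.asarray(cls_err, dtype=float)
    w = np.exp(-alpha * err)            # lower joint error -> higher weight
    w[w < drop_thresh * w.max()] = 0.0  # remove unreliable modalities entirely
    return w / w.sum()                  # normalize so weights sum to 1
```

For example, a modality with a large joint error (e.g. a color histogram under heavy occlusion) would receive a weight of zero, so the tracker's decision rests only on the remaining reliable modalities.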