Authors
Volkan Kılıç, Mark Barnard, Wenwu Wang, Josef Kittler
Publication date
2015/2
Journal
IEEE Transactions on Multimedia
Volume
17
Issue
2
Pages
186-200
Publisher
IEEE
Description
The problem of tracking multiple moving speakers in indoor environments has received much attention. Earlier techniques were based purely on a single modality, e.g., vision. Recently, the fusion of multi-modal information has been shown to be instrumental in improving tracking performance, as well as robustness in the case of challenging situations like occlusions (by the limited field of view of cameras or by other speakers). However, data fusion algorithms often suffer from noise corrupting the sensor measurements which cause non-negligible detection errors. Here, a novel approach to combining audio and visual data is proposed. We employ the direction of arrival angles of the audio sources to reshape the typical Gaussian noise distribution of particles in the propagation step and to weight the observation model in the measurement step. This approach is further improved by solving a typical problem …
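The idea of using the audio direction of arrival (DOA) in both the propagation and measurement steps of a particle filter can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that the DOA projects onto a vertical image line x = doa_x, and the function names, the `pull` factor, and the Gaussian widths are all hypothetical.

```python
import math
import random


def propagate(particles, doa_x, sigma=5.0, pull=0.5, rng=random):
    """Move each (x, y) particle with Gaussian noise biased toward the DOA line.

    Assumption: the DOA is represented as the vertical image line x = doa_x,
    and `pull` (hypothetical) sets how strongly the noise mean is shifted.
    """
    moved = []
    for x, y in particles:
        # Instead of zero-mean noise, shift the x-mean toward the DOA line.
        mean_dx = pull * (doa_x - x)
        moved.append((x + rng.gauss(mean_dx, sigma), y + rng.gauss(0.0, sigma)))
    return moved


def doa_weight(x, doa_x, sigma_doa=10.0):
    """Gaussian factor that decays with horizontal distance from the DOA line."""
    return math.exp(-0.5 * ((x - doa_x) / sigma_doa) ** 2)


def update_weights(particles, visual_likelihood, doa_x):
    """Scale a caller-supplied visual likelihood by the DOA factor, then normalise."""
    raw = [visual_likelihood(p) * doa_weight(p[0], doa_x) for p in particles]
    total = sum(raw)
    if total == 0.0:
        # Fall back to uniform weights if every particle has zero likelihood.
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]
```

In this sketch the audio cue concentrates particles near the estimated speaker direction before the visual measurement is applied, and then down-weights particles far from that direction. How the paper itself maps DOA angles into the image and combines the two cues is not stated in the truncated abstract.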
Total citations
[Chart: citations per year, 2015-2024]
Scholar articles
V Kılıç, M Barnard, W Wang, J Kittler - IEEE Transactions on Multimedia, 2014