Authors
Vassilis Pitsikalis, Athanassios Katsamanis, George Papandreou, Petros Maragos
Publication date
2006
Conference
Ninth International Conference on Spoken Language Processing
Description
While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact in pattern recognition tasks has received relatively little attention to date. In this work we explicitly take into account feature measurement uncertainty and we show how classification rules should be adjusted to compensate for its effects. Our approach is particularly fruitful in multimodal fusion scenarios, such as audiovisual speech recognition, where multiple streams of complementary time-evolving features are integrated. For such applications, provided that the measurement noise uncertainty for each feature stream can be estimated, the proposed framework leads to highly adaptive multimodal fusion rules which are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall under our scheme under certain assumptions; this provides novel insights into their applicability for various tasks and suggests new practical ways for estimating the stream weights adaptively. The potential of our approach is demonstrated in audio-visual speech recognition using either synchronous or asynchronous models.
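The abstract does not spell out the adjusted classification rule, so the following is only a minimal sketch under stated assumptions: each class is modeled per stream by a diagonal-covariance Gaussian, the measurement noise is zero-mean Gaussian and independent of the clean feature, and streams are conditionally independent given the class. Under those assumptions the noise variance simply adds to the model variance when scoring an observation. The names `log_gauss_diag` and `classify`, and all numbers below, are hypothetical and chosen for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def classify(streams, models):
    """Pick the class with the highest uncertainty-compensated log-likelihood.

    streams: list of (features, noise_var) pairs, one per modality.
    models:  {label: [(mean, var), ...]} with one (mean, var) per stream.
    Assumption: additive Gaussian measurement noise, so the effective
    variance of each feature is model variance + noise variance.
    """
    scores = {}
    for label, params in models.items():
        scores[label] = sum(
            log_gauss_diag(x, mean, [v + n for v, n in zip(var, noise_var)])
            for (x, noise_var), (mean, var) in zip(streams, params)
        )
    return max(scores, key=scores.get), scores

# Toy two-class, two-stream setup (illustrative values only).
models = {
    "A": [([0.0], [1.0]), ([0.0], [1.0])],  # (audio, visual) parameters
    "B": [([2.0], [1.0]), ([2.0], [1.0])],
}
audio, visual = [1.8], [0.4]  # audio favors "B", visual favors "A"

clean_label, _ = classify([(audio, [0.0]), (visual, [0.0])], models)
noisy_label, _ = classify([(audio, [9.0]), (visual, [0.0])], models)
```

With no measurement noise the audio evidence dominates and the toy example selects "B"; once the audio stream is declared highly uncertain its contribution is flattened and the decision moves to "A". This flattening behaves much like lowering a stream weight, which is the kind of connection to weight-based fusion that the abstract mentions.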
Total citations
[Chart: citations per year, 2005 to 2023]
Scholar articles
V Pitsikalis, A Katsamanis, G Papandreou, P Maragos - INTERSPEECH, 2006