Authors
Yu-Gang Jiang, Jun Yang, Chong-Wah Ngo, Alexander G Hauptmann
Publication date
2009/11/13
Journal
IEEE Transactions on Multimedia
Volume
12
Issue
1
Pages
42-53
Publisher
IEEE
Description
Based on the local keypoints extracted as salient image patches, an image can be described as a "bag-of-visual-words (BoW)", and this representation has appeared promising for object and scene classification. The performance of BoW features in semantic concept detection for large-scale multimedia databases is subject to various representation choices. In this paper, we conduct a comprehensive study on the representation choices of BoW, including vocabulary size, weighting scheme, stop word removal, feature selection, spatial information, and visual bi-gram. We offer practical insights into how to optimize the performance of BoW by making appropriate representation choices. For the weighting scheme, we elaborate a soft-weighting method to assess the significance of a visual word to an image. We experimentally show that the soft-weighting outperforms other popular weighting schemes such as TF-IDF …
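To make the weighting choices mentioned in the description concrete, the following is a minimal sketch, assuming NumPy and a small precomputed vocabulary of visual-word centroids. The function names (hard_bow, soft_bow, tf_idf), the top_n parameter, and the rank-halving weights are illustrative assumptions only; the description above does not give the paper's soft-weighting formula, so this should not be read as the authors' method.

```python
import numpy as np


def _squared_distances(descriptors, vocabulary):
    # Pairwise squared Euclidean distances, shape (num_keypoints, num_words).
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    return (diff ** 2).sum(axis=2)


def hard_bow(descriptors, vocabulary):
    """Term-frequency histogram: each keypoint votes once for its nearest word."""
    nearest = _squared_distances(descriptors, vocabulary).argmin(axis=1)
    return np.bincount(nearest, minlength=len(vocabulary)).astype(float)


def soft_bow(descriptors, vocabulary, top_n=4):
    """Hypothetical soft assignment: each keypoint votes for its top_n nearest
    words, with the vote halved at each successive rank (1, 1/2, 1/4, ...)."""
    ranked = np.argsort(_squared_distances(descriptors, vocabulary), axis=1)
    ranked = ranked[:, :top_n]
    hist = np.zeros(len(vocabulary))
    for rank in range(ranked.shape[1]):
        # np.add.at accumulates correctly when the same word index repeats.
        np.add.at(hist, ranked[:, rank], 0.5 ** rank)
    return hist


def tf_idf(histograms):
    """Textbook TF-IDF over a (num_images, num_words) matrix of word counts."""
    totals = np.maximum(histograms.sum(axis=1, keepdims=True), 1e-12)
    tf = histograms / totals
    df = (histograms > 0).sum(axis=0)
    idf = np.log(len(histograms) / np.maximum(df, 1))
    return tf * idf


if __name__ == "__main__":
    # Toy 2-D "descriptors" and a three-word vocabulary, for illustration only.
    vocabulary = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    descriptors = np.array([[1.0, 0.0], [9.0, 0.0], [0.0, 9.0], [0.5, 0.0]])
    print(hard_bow(descriptors, vocabulary))
    print(soft_bow(descriptors, vocabulary, top_n=2))
```

In this sketch, hard assignment discards how close a keypoint is to its second- and third-nearest words, whereas the soft variant spreads each keypoint's vote across several nearby words. That is the general intuition behind soft weighting, though the exact weights here are an assumption.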
Scholar articles
YG Jiang, J Yang, CW Ngo, AG Hauptmann - IEEE Transactions on Multimedia, 2009