Authors
Yu-Gang Jiang, Jun Yang, Chong-Wah Ngo, Alexander G Hauptmann
Publication date
2010/1
Journal
Multimedia, IEEE Transactions on
Volume
12
Issue
1
Pages
42-53
Publisher
IEEE
Description
Based on the local keypoints extracted as salient image patches, an image can be described as a "bag-of-visual-words (BoW)" and this representation has appeared promising for object and scene classification. The performance of BoW features in semantic concept detection for large-scale multimedia databases is subject to various representation choices. In this paper, we conduct a comprehensive study on the representation choices of BoW, including vocabulary size, weighting scheme, stop word removal, feature selection, spatial information, and visual bi-gram. We offer practical insights into how to optimize the performance of BoW by choosing appropriate representation choices. For the weighting scheme, we elaborate a soft-weighting method to assess the significance of a visual word to an image. We experimentally show that the soft-weighting outperforms other popular weighting schemes such as TF-IDF …
Total citations
[Chart: citations per year, 2008 to 2024]