Authors
Matthew D Zeiler, Graham W Taylor, Rob Fergus
Publication date
2011/11/6
Conference
2011 International Conference on Computer Vision
Pages
2018-2025
Publisher
IEEE
Description
We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts and complete objects. To build our model we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This makes it possible to learn multiple layers of representation, and we show models with 4 layers, trained on images from the Caltech-101 and Caltech-256 datasets. When combined with a standard classifier, features extracted from these models outperform SIFT, as well as representations from other feature learning methods.
Total citations
[Chart: citations per year, 2012-2024]