Selecting autoencoder features for layout analysis of historical documents

H Wei, M Seuret, K Chen, A Fischer, M Liwicki, R Ingold
Proceedings of the 3rd International Workshop on Historical Document Imaging …, 2015 - dl.acm.org
Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitization qualities. Under these conditions, designing robust features for machine learning is highly challenging. We use convolutional autoencoders to learn features from the images. To increase classification accuracy and reduce the feature dimension, we propose a novel feature selection method that cascades adapted versions of two conventional methods. Compared to three conventional methods and our previous work, the proposed method achieves higher classification accuracy in most cases while maintaining a low feature dimension. In addition, we find that a significant number of autoencoder features are redundant or irrelevant for the classification, and we offer explanations for this. To the best of our knowledge, this paper is one of the first investigations in the field of image processing into detecting redundant and irrelevant autoencoder features using feature selection.
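The abstract does not name the two conventional methods that are cascaded, so the following is only a rough sketch of what a two-stage cascade can look like, not the authors' method. It assumes a correlation filter that discards redundant features, followed by a greedy forward wrapper that discards irrelevant ones. All function names are hypothetical, and the nearest-centroid classifier is a stand-in for whatever classifier is actually used on the autoencoder features.

```python
import numpy as np


def drop_redundant(X, threshold=0.95):
    """Stage 1 (filter, assumed): keep a feature only if its absolute Pearson
    correlation with every already-kept feature is below `threshold`.
    Constant features carry no information and are skipped."""
    std = X.std(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if std[j] == 0:
            continue
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val):
    """Validation accuracy of a nearest-centroid classifier (stand-in)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())


def forward_select(X_train, y_train, X_val, y_val, candidates):
    """Stage 2 (wrapper, assumed): greedily add the candidate feature that
    most improves validation accuracy; stop when none improves it."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scores = [
            nearest_centroid_accuracy(
                X_train[:, selected + [j]], y_train,
                X_val[:, selected + [j]], y_val,
            )
            for j in remaining
        ]
        i = int(np.argmax(scores))
        if scores[i] <= best:
            break
        best = scores[i]
        selected.append(remaining.pop(i))
    return selected, best


def cascade_select(X_train, y_train, X_val, y_val, threshold=0.95):
    """Run the filter first, then the wrapper on the surviving features."""
    candidates = drop_redundant(X_train, threshold)
    return forward_select(X_train, y_train, X_val, y_val, candidates)
```

Running the cheap filter first shrinks the candidate set, so the more expensive wrapper has to evaluate fewer feature subsets. Here `X_train` and `X_val` would hold one row per sample and one column per learned feature.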