Deep learning for historical document analysis and recognition—a survey

F Lombardi, S Marinai - Journal of Imaging, 2020 - mdpi.com
Nowadays, deep learning methods are employed in a broad range of research fields. The
analysis and recognition of historical documents, as we survey in this work, is not an …

Improved training of generative adversarial networks using representative features

D Bang, H Shim - International conference on machine …, 2018 - proceedings.mlr.press
Despite the success of generative adversarial networks (GANs) for image generation, the
trade-off between visual quality and image diversity remains a significant issue. This paper …

Extracting text from scanned Arabic books: a large-scale benchmark dataset and a fine-tuned Faster-R-CNN model

R Elanwar, W Qin, M Betke, D Wijaya - International Journal on Document …, 2021 - Springer
Datasets of documents in Arabic are urgently needed to promote computer vision and
natural language processing research that addresses the specifics of the language …

Indiscapes: Instance segmentation networks for layout parsing of historical Indic manuscripts

A Prusty, S Aitha, A Trivedi… - … on Document Analysis …, 2019 - ieeexplore.ieee.org
Historical palm-leaf manuscript and early paper documents from Indian subcontinent form
an important part of the world's literary and cultural heritage. Despite their importance, large …

Binarization free layout analysis for Arabic historical documents using fully convolutional networks

BK Barakat, J El-Sana - … workshop on Arabic and derived script …, 2018 - ieeexplore.ieee.org
We present a Fully Convolutional Network based method for layout analysis of non-
binarized historical Arabic manuscripts. The document image is segmented into main text …

Layout analysis on challenging historical Arabic manuscripts using Siamese network

R Alaasam, B Kurar, J El-Sana - 2019 International Conference …, 2019 - ieeexplore.ieee.org
This paper presents layout analysis for historical Arabic documents using siamese network.
Given pages from different documents, we divide them into patches of similar sizes. We train …

Historical document layout analysis using anisotropic diffusion and geometric features

GM BinMakhashen, SA Mahmoud - International Journal on Digital …, 2020 - Springer
There are several digital libraries worldwide which maintain valuable historical manuscripts.
Usually, digital copies of these manuscripts are offered to researchers and readers in raster …

A pitfall of unsupervised pre-training

M Alberti, M Seuret, R Ingold, M Liwicki - arXiv preprint arXiv:1703.04332, 2017 - arxiv.org
The point of this paper is to question typical assumptions in deep learning and suggest
alternatives. A particular contribution is to prove that even if a Stacked Convolutional Auto …

Making scanned Arabic documents machine accessible using an ensemble of SVM classifiers

R Elanwar, W Qin, M Betke - International Journal on Document Analysis …, 2018 - Springer
Raster-image PDF files originating from scanning or photographing paper documents are
inaccessible to both text search engines and screen readers that people with visual …

A comparative study of two state-of-the-art feature selection algorithms for texture-based pixel-labeling task of ancient documents

M Mehri, R Chaieb, K Kalti, P Héroux, R Mullot… - Journal of …, 2018 - mdpi.com
Recently, texture features have been widely used for historical document image analysis.
However, few studies have focused exclusively on feature selection algorithms for historical …