A comprehensive survey of mostly textual document segmentation algorithms since 2008

S Eskenazi, P Gomez-Krämer, JM Ogier - Pattern recognition, 2017 - Elsevier
In document image analysis, segmentation is the task that identifies the regions of a
document. The increasing number of applications of document analysis requires a good …

Two-stage generative adversarial networks for binarization of color document images

S Suh, J Kim, P Lukowicz, YO Lee - Pattern Recognition, 2022 - Elsevier
Document image enhancement and binarization methods are often used to improve the
accuracy and efficiency of document image analysis tasks such as text recognition …

ICDAR 2013 competition on book structure extraction

A Doucet, G Kazai, S Colutto… - 2013 12th International …, 2013 - ieeexplore.ieee.org
This paper summarizes the 3rd Book Structure Extraction competition that was run at the
ICDAR 2013. Its goal is to evaluate and compare automatic techniques for deriving structure …

CNN based page object detection in document images

X Yi, L Gao, Y Liao, X Zhang, R Liu… - 2017 14th IAPR …, 2017 - ieeexplore.ieee.org
Object detection in natural scenes has been widely researched …

Efficient multiscale Sauvola's binarization

G Lazzara, T Géraud - International Journal on Document Analysis and …, 2014 - Springer
This work focuses on the most commonly used binarization method: Sauvola's. It performs
relatively well on classical documents, however, three main defects remain: the window …

DocCreator: A new software for creating synthetic ground-truthed document images

N Journet, M Visani, B Mansencal, K Van-Cuong… - Journal of …, 2017 - mdpi.com
Most digital libraries that provide user-friendly interfaces, enabling quick and intuitive access
to their resources, are based on Document Image Analysis and Recognition (DIAR) …

A deep learning-based formula detection method for PDF documents

L Gao, X Yi, Y Liao, Z Jiang, Z Yan… - 2017 14th IAPR …, 2017 - ieeexplore.ieee.org
In practice, PDF files may be generated by different tools and their character information
quality could be different. As a result, the approaches to detecting formulae from PDF …

Using convolutional encoder-decoder for document image binarization

X Peng, H Cao, P Natarajan - 2017 14th IAPR international …, 2017 - ieeexplore.ieee.org
Document image binarization is one of the critical initial steps for document analysis and
understanding. Previous work mostly focused on exploiting hand-crafted features to build …

Historical collaborative geocoding

R Cura, B Dumenieu, N Abadie, B Costes… - … International Journal of …, 2018 - mdpi.com
The latest developments in the field of digital humanities have increasingly enabled the
construction of large data sets which can be easily accessed and used. These data sets …

ICDAR 2013 competition on historical newspaper layout analysis (HNLA 2013)

A Antonacopoulos, C Clausner… - 2013 12th …, 2013 - ieeexplore.ieee.org
This paper presents an objective comparative evaluation of layout analysis methods for
scanned historical newspapers. It describes the competition (modus operandi, dataset and …