Document layout analysis: a comprehensive survey

GM Binmakhashen, SA Mahmoud - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
Document layout analysis (DLA) is a preprocessing step of document understanding
systems. It is responsible for detecting and annotating the physical structure of documents …

Hyperspectral document image processing: Applications, challenges and future prospects

R Qureshi, M Uzair, K Khurshid, H Yan - Pattern Recognition, 2019 - Elsevier
Automatic image analysis is a crucial component of many intelligent systems designed for
high-level understanding of documents. Most document image understanding systems are …

PubLayNet: largest dataset ever for document layout analysis

X Zhong, J Tang, AJ Yepes - 2019 International conference on …, 2019 - ieeexplore.ieee.org
Recognizing the layout of unstructured digital documents is an important step when parsing
the documents into structured machine-readable format for downstream applications. Deep …

Towards end-to-end unified scene text detection and layout analysis

S Long, S Qin, D Panteleev… - Proceedings of the …, 2022 - openaccess.thecvf.com
Scene text detection and document layout analysis have long been treated as two separate
tasks in different image domains. In this paper, we bring them together and introduce the …

A new local adaptive thresholding technique in binarization

TR Singh, S Roy, OI Singh, T Sinam… - arXiv preprint arXiv …, 2012 - arxiv.org
Image binarization is the process of separation of pixel values into two groups, white as
background and black as foreground. Thresholding plays a major role in binarization of images …

Efficient implementation of local adaptive thresholding techniques using integral images

F Shafait, D Keysers, TM Breuel - Document recognition and …, 2008 - spiedigitallibrary.org
Adaptive binarization is an important first step in many document analysis and OCR
processes. This paper describes a fast adaptive binarization algorithm that yields the same …

New filtering approaches for phishing email

A Bergholz, J De Beer, S Glahn… - Journal of computer …, 2010 - content.iospress.com
Phishing emails usually contain a message from a credible looking source requesting a user
to click a link to a website where she/he is asked to enter a password or other confidential …

Performance evaluation and benchmarking of six page segmentation algorithms

F Shafait, D Keysers, T Breuel - IEEE Transactions on Pattern …, 2008 - ieeexplore.ieee.org
Informative benchmarks are crucial for optimizing the page segmentation step of an OCR
system, frequently the performance-limiting step for overall OCR system performance. We …

Two geometric algorithms for layout analysis

TM Breuel - Document Analysis Systems V: 5th International …, 2002 - Springer
This paper presents geometric algorithms for solving two key problems in layout analysis:
finding a cover of the background whitespace of a document in terms of maximal empty …

Beyond document object detection: instance-level segmentation of complex layouts

S Biswas, P Riba, J Lladós, U Pal - International Journal on Document …, 2021 - Springer
Information extraction is a fundamental task of many business intelligence services
that entail massive document processing. Understanding a document page structure in …