Data Rescue for Historical Document Tables Using Semi-Supervised Learning

LG Singh, SE Middleton - 2024 - researchsquare.com
This study uses a novel semi-supervised learning framework to explore Tabular Structure
Recognition (TSR) for digitizing historical documents, specifically employing the …

Enabling Deep Document Image Analysis with Generative Models

K Nikolaidou - 2023 - diva-portal.org
Historical documents are a valuable source of cultural knowledge and can provide
information about previous events, societies, beliefs, and cultures. They can serve as an …

Investigations on Self-supervised Learning for Script-, Font-type, and Location Classification on Historical Documents

J Zenk, F Kordon, M Mayr, M Seuret… - Proceedings of the 7th …, 2023 - dl.acm.org
In the context of automated classification of historical documents, we investigate three
contemporary self-supervised learning (SSL) techniques (SimSiam, Dino, and VICReg) for …

Visual unsupervised deep learning model design for historical document image analysis

M Omrani Tamrin - 2022 - espace.etsmtl.ca
Historical documents are one of the most crucial influences that drive scientific and historical
development. Some historical documents are present and can be used through classical …

GloSAT historical measurement table dataset: enhanced table structure recognition annotation for downstream historical data rescue

J Ziomek, SE Middleton - Proceedings of the 6th International Workshop …, 2021 - dl.acm.org
Understanding and extracting tables from documents is a research problem that has been
studied for decades. Table structure recognition is the labelling of components within a …

Historical document processing

B Gatos, G Louloudis, N Stamatopoulos… - Proceedings of the 2017 …, 2017 - dl.acm.org
This tutorial focuses on recent advances and ongoing developments for historical document
processing. It includes the main challenges involved, the different tasks that have to be …

Curation of historical Arabic handwritten digit datasets from Ottoman population registers: a deep transfer learning case study

YS Can, ME Kabadayı - … Conference on Big Data (Big Data), 2020 - ieeexplore.ieee.org
With the increasing number of digitization efforts of historical manuscripts and archives,
automatical information retrieval systems need to extract meaning fast and reliably …

MambaTab: A simple yet effective approach for handling tabular data

MA Ahamed, Q Cheng - arXiv preprint arXiv:2401.08867, 2024 - arxiv.org
Tabular data remains ubiquitous across domains despite growing use of images and texts
for machine learning. While deep learning models like convolutional neural networks and …

An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts

MA Souibgui, P Torras, J Chen, A Fornés - Proceedings of the 7th …, 2023 - dl.acm.org
This paper investigates the effectiveness of different deep learning HTR families, including
LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in …

Adaptive scaling for archival table structure recognition

XH Li, F Yin, XY Zhang, CL Liu - … September 5–10, 2021, Proceedings, Part …, 2021 - Springer
Table detection and structure recognition from archival document images remain
challenging due to diverse table structures, complex document layouts, degraded image …