Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

Current status and performance analysis of table recognition in document images with deep neural networks

KA Hashmi, M Liwicki, D Stricker, MA Afzal… - IEEE …, 2021 - ieeexplore.ieee.org
The first phase of table recognition is to detect the tabular area in a document.
Subsequently, the tabular structures are recognized in the second phase in order to extract …

DiT: Self-supervised pre-training for document image transformer

J Li, Y Xu, T Lv, L Cui, C Zhang, F Wei - Proceedings of the 30th ACM …, 2022 - dl.acm.org
Image Transformer has recently achieved significant progress for natural image
understanding, either using supervised (ViT, DeiT, etc.) or self-supervised (BEiT, MAE, etc.) …

CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents

D Prasad, A Gadpal, K Kapadni… - Proceedings of the …, 2020 - openaccess.thecvf.com
An automatic table recognition method for interpretation of tabular data in document images
majorly involves solving two problems of table detection and table structure recognition. The …

LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis

Z Shen, R Zhang, M Dell, BCG Lee, J Carlson… - Document Analysis and …, 2021 - Springer
Recent advances in document image analysis (DIA) have been primarily driven by the
application of neural networks. Ideally, research outcomes could be easily deployed in …

DocBank: A benchmark dataset for document layout analysis

M Li, Y Xu, L Cui, S Huang, F Wei, Z Li… - arXiv preprint arXiv …, 2020 - arxiv.org
Document layout analysis usually relies on computer vision models to understand
documents while ignoring textual information that is vital to capture. Meanwhile, high quality …

LayoutXLM: Multimodal pre-training for multilingual visually-rich document understanding

Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio… - arXiv preprint arXiv …, 2021 - arxiv.org
Multimodal pre-training with text, layout, and image has achieved SOTA performance for
visually-rich document understanding tasks recently, which demonstrates the great potential …

PubTables-1M: Towards comprehensive table extraction from unstructured documents

B Smock, R Pesala, R Abraham - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Recently, significant progress has been made applying machine learning to the problem of
table structure inference and extraction from unstructured documents. However, one of the …

Table structure recognition using top-down and bottom-up cues

S Raja, A Mondal, CV Jawahar - … Conference, Glasgow, UK, August 23–28 …, 2020 - Springer
Tables are information-rich structured objects in document images. While significant work
has been done in localizing tables as graphic objects in document images, only limited …

Global Table Extractor (GTE): A framework for joint table identification and cell structure recognition using visual context

X Zheng, D Burdick, L Popa… - Proceedings of the …, 2021 - openaccess.thecvf.com
Documents are often the format of choice for knowledge sharing and preservation in
business and science, within which are tables that capture most of the critical data …