StrucText: Structured text understanding with multi-modal transformers

Y Li, Y Qian, Y Yu, X Qin, C Zhang, Y Liu… - Proceedings of the 29th …, 2021 - dl.acm.org
Structured text understanding on Visually Rich Documents (VRDs) is a crucial part of
Document Intelligence. Due to the complexity of content and layout in VRDs, structured text …

Kleister: A novel task for information extraction involving long documents with complex layout

F Graliński, T Stanisławek, A Wróblewska… - arXiv preprint arXiv …, 2020 - arxiv.org
State-of-the-art solutions for Natural Language Processing (NLP) are able to capture a
broad range of contexts, like the sentence-level context or document-level context for short …

Deterministic routing between layout abstractions for multi-scale classification of visually rich documents

R Sarkhel, A Nandi - 28th International Joint Conference on Artificial …, 2019 - par.nsf.gov
Classifying heterogeneous visually rich documents is a challenging task. Difficulty of this
task increases even more if the maximum allowed inference turnaround time is constrained …

Interpretable multi-headed attention for abstractive summarization at controllable lengths

R Sarkhel, M Keymanesh, A Nandi… - arXiv preprint arXiv …, 2020 - arxiv.org
Abstractive summarization at controllable lengths is a challenging task in natural language
processing. It is even more challenging for domains where limited training data is available …

Self-training for label-efficient information extraction from semi-structured web-pages

R Sarkhel, B Huang, C Lockard… - Proceedings of the VLDB …, 2023 - dl.acm.org
Information Extraction (IE) from semi-structured web-pages is a long studied problem.
Training a model for this extraction task requires a large number of human-labeled samples …

Glean: Structured extractions from templatic documents

S Tata, N Potti, JB Wendt, LB Costa, M Najork… - Proceedings of the …, 2021 - dl.acm.org
Extracting structured information from templatic documents is an important problem with the
potential to automate many real-world business workflows such as payment, procurement …

Improving information extraction from visually rich documents using visual span representations

R Sarkhel, A Nandi - Proceedings of the VLDB Endowment, 2021 - par.nsf.gov
Along with textual content, visual features play an essential role in the semantics of visually
rich documents. Information extraction (IE) tasks perform poorly on these documents if these …

Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models

Y Zhou, JB Wendt, N Potti, J Xie… - Proceedings of the 2023 …, 2023 - aclanthology.org
Building automatic extraction models for visually rich documents like invoices, receipts, bills,
tax forms, etc. has received significant attention lately. A key bottleneck in developing …

Cross-modal entity matching for visually rich documents

R Sarkhel, A Nandi - arXiv preprint arXiv:2303.00720, 2023 - arxiv.org
Visually rich documents (VRD) are physical/digital documents that utilize visual cues to
augment their semantics. The information contained in these documents is often …

Noise-Aware Training of Layout-Aware Language Models

R Sarkhel, X Ren, LB Costa, G Su, V Perot… - arXiv preprint arXiv …, 2024 - arxiv.org
A visually rich document (VRD) utilizes visual features along with linguistic cues to
disseminate information. Training a custom extractor that identifies named entities from a …