GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding

N Biescas, C Boned, J Lladós, S Biswas - International Conference on …, 2024 - Springer
This paper presents GeoContrastNet, a language-agnostic framework to structured
document understanding (DU) by integrating a contrastive learning objective with graph …

EntityLayout: Entity-Level Pre-training Language Model for Semantic Entity Recognition and Relation Extraction

CB Xu, YM Chen, CL Liu - International Conference on Document Analysis …, 2024 - Springer
Semantic entity recognition (SER) and relation extraction (RE) are the core tasks of
information extraction from visually-rich documents (VrDs). Although self-supervised pre …

How Does Changing the Optical Character Recognition System Impact the Layout-Aware Named Entity Recognition Models?

J Macedo, B Bezerra, C Zanchettin - International Workshop on Document …, 2024 - Springer
Merging information from physical and digital documents is essential in an era when
information is becoming even more relevant. Different strategies have been used to combine …

Information Extraction from Business Documents

M Geletka, M Bankovič, D Meluš… - RASLAN 2022 Recent …, 2022 - nlp.fi.muni.cz
Document AI is a relatively new research topic that refers to techniques for automatically
reading, understanding, and analyzing business documents. Nowadays, many companies …

Information Redundancy and Biases in Public Document Information Extraction Benchmarks

S Laatiri, P Ratnamogan, J Tang, L Lam… - … on Document Analysis …, 2023 - Springer
Advances in the Visually-rich Document Understanding (VrDU) field and particularly the Key-
Information Extraction (KIE) task are marked with the emergence of efficient Transformer …

A data-centric approach to information extraction of expenses

M Wilk, A Verriest, A Legay - dial.uclouvain.be
The study of automatic processing and comprehension of business documents is known as
Document AI. It covers activities such as reading and evaluating a document's content …