Transformers-based information extraction with limited data for domain-specific business documents

MT Nguyen, DT Le, L Le - Engineering Applications of Artificial Intelligence, 2021 - Elsevier
Information extraction plays an important role for data transformation in business
cases. However, building extraction systems in actual cases faces two challenges: (i) the …

Improving information extraction on business documents with specific pre-training tasks

T Douzon, S Duffner, C Garcia, J Espinas - International Workshop on …, 2022 - Springer
Transformer-based Language Models are widely used in Natural Language
Processing related tasks. Thanks to their pre-training, they have been successfully adapted …

Transfer learning for information extraction with limited data

MT Nguyen, VA Phan, LT Linh, NH Son… - … Conference of the …, 2020 - Springer
This paper presents a practical approach to fine-grained information extraction. Through
plenty of authors' experiences in practically applying information extraction to business …

Key information extraction from documents: Evaluation and generator

O Bensch, M Popa, C Spille - arXiv preprint arXiv:2106.14624, 2021 - arxiv.org
Extracting information from documents usually relies on natural language processing
methods working on one-dimensional sequences of text. In some cases, for example, for the …

Attend, copy, parse: End-to-end information extraction from documents

RB Palm, F Laws, O Winther - 2019 International Conference …, 2019 - ieeexplore.ieee.org
Document information extraction tasks performed by humans create data consisting of a
PDF or document image input, and extracted string outputs. This end-to-end data is naturally …

Information extraction from invoices

A Hamdi, E Carel, A Joseph, M Coustaty… - … Conference on Document …, 2021 - Springer
The present paper is focused on information extraction from key fields of invoices using two
different methods based on sequence labeling. Invoices are semi-structured documents in …

Business document information extraction: Towards practical benchmarks

M Skalický, Š Šimsa, M Uřičář, M Šulc - International Conference of the …, 2022 - Springer
Information extraction from semi-structured documents is crucial for frictionless
business-to-business (B2B) communication. While machine learning problems related to …

A comparative study of information extraction strategies using an attention-based neural network

S Tarride, A Lemaitre, B Coüasnon… - International Workshop on …, 2022 - Springer
This article focuses on information extraction in historical handwritten marriage records.
Traditional approaches rely on a sequential pipeline of two consecutive tasks: handwriting …

Rapid adaptation of BERT for information extraction on domain-specific business documents

R Zhang, W Yang, L Lin, Z Tu, Y Xie, Z Fu, Y Xie… - arXiv preprint arXiv …, 2020 - arxiv.org
Techniques for automatically extracting important content elements from business
documents such as contracts, statements, and filings have the potential to make business …

Information extraction from free-form CV documents in multiple languages

D Vukadin, AS Kurdija, G Delač, M Šilić - IEEE Access, 2021 - ieeexplore.ieee.org
This paper proposes two natural language processing models for extracting useful
information from multilingual, unstructured (free form) CV documents. The model identifies …