This paper describes the first large-scale article detection and extraction efforts on the Finnish Digi newspaper material of the National Library of Finland (NLF) using data of one …
The Pivan web platform is an open-source tool for managing different stages of automatic document processing, such as layout analysis, transcription, and named entity recognition. It …