Research in the digital humanities and computational social sciences requires overcoming complexity in research data, methodology, and research questions. In this article, we show …
Many historical corpora suffer from errors introduced by the OCR (optical character recognition) methods used in the digitization process. Correcting these errors manually is a …
R Bawden, J Poinhos, E Kogkitsidou… - Proceedings of the …, 2022 - aclanthology.org
Spelling normalisation is a useful step in the study and analysis of historical language texts, whether it is manual analysis by experts or automatic analysis using downstream natural …
Machine translation is one of the applications of natural language processing which has been explored in different languages. Recently, researchers started paying attention …
We compare different LSTMs and transformer models in terms of their effectiveness in normalizing dialectal Finnish into the normative standard Finnish. As dialect is the common …
O Kuparinen - Tenth Workshop on NLP for Similar Languages …, 2023 - aclanthology.org
This paper presents Murreviikko, a dataset of dialectal Finnish tweets which have been dialectologically annotated and manually normalized to a standard form. The dataset can be …
M Bollmann - 2018 - hss-opus.ub.ruhr-uni-bochum.de
With the increasing availability of digitized resources of historical documents, interest in effective natural language processing (NLP) for these documents is on the rise. However …
This paper presents an overview of the LL(O)D and NLP methods, tools and data for detecting and representing semantic change, with its main application in humanities …
Texts written in Old Literary Finnish represent the first literary works written in Finnish, starting from the 16th century. There have been several projects in Finland that have …