Weigh your words—memory-based lemmatization for Middle Dutch

M Bollmann - arXiv preprint arXiv:1904.02036, 2019 - arxiv.org

There is no consensus on the state-of-the-art approach to historical text normalization. Many
techniques have been proposed, including rule-based methods, distance metrics, character …

Cited by 96 · Related articles · All 7 versions

[BOOK] Quantitative historical linguistics: A corpus framework

GB Jenset, B McGillivray - 2017 - books.google.com

This book is an innovative guide to quantitative, corpus-based research in historical and
diachronic linguistics. Gard B. Jenset and Barbara McGillivray argue that, although historical …

Cited by 108 · Related articles · All 6 versions

Collaborative authorship in the twelfth century: A stylometric study of Hildegard of Bingen and Guibert of Gembloux

M Kestemont, S Moens… - Digital Scholarship in the …, 2015 - academic.oup.com

Hildegard of Bingen (1098–1179) is one of the most influential female authors of
the Middle Ages. From the point of view of computational stylistics, the oeuvre attributed to …

Cited by 119 · Related articles · All 11 versions

Modernizing historical Slovene words with character-based SMT

Y Scherrer, T Erjavec - BSNLP 2013-4th Biennial Workshop on …, 2013 - inria.hal.science

We propose a language-independent word normalization method exemplified on
modernizing historical Slovene words. Our method relies on character-based statistical …

Cited by 54 · Related articles · All 14 versions

[PDF] Normalization of historical texts with neural network models

M Bollmann - 2018 - hss-opus.ub.ruhr-uni-bochum.de

With the increasing availability of digitized resources of historical documents, interest in
effective natural language processing (NLP) for these documents is on the rise. However …

Cited by 31 · Related articles · All 7 versions

LL(O)D and NLP perspectives on semantic change for Humanities research

F Armaselu, ES Apostol, AF Khan… - Semantic …, 2022 - content.iospress.com

This paper presents an overview of the LL(O)D and NLP methods, tools and data for
detecting and representing semantic change, with its main application in humanities …

Cited by 15 · Related articles · All 13 versions

Lemmatization for variation-rich languages using deep learning

M Kestemont, G De Pauw, R van Nie… - Digital Scholarship in …, 2017 - academic.oup.com

In this article, we describe a novel approach to sequence tagging for languages that are rich
in (e.g. orthographic) surface variation. We focus on lemmatization, a basic step in many …

Cited by 33 · Related articles · All 7 versions

Lemmatization for ancient languages: Rules or neural networks?

O Dereza - Artificial Intelligence and Natural Language: 7th …, 2018 - Springer

Lemmatisation, which is one of the most important stages of text preprocessing, consists in
grouping the inflected forms of a word together so they can be analysed as a single item …

Cited by 24 · Related articles · All 3 versions

Modernising historical Slovene words

Y Scherrer, T Erjavec - Natural Language Engineering, 2016 - cambridge.org

We propose a language-independent word normalisation method and exemplify it on
modernising historical Slovene words. Our method relies on character-level statistical …

Cited by 34 · Related articles · All 5 versions

[PDF] From old texts to modern spellings: an experiment in automatic normalisation

I Hendrickx, R Marquilhas - Journal for Language Technology and …, 2011 - jlcl.org

We aim to tackle the problem of spelling variations in a corpus of personal Portuguese letters
from the 16th to the 20th century. We investigated the extent to which the task of normalising …

Cited by 42 · Related articles · All 7 versions
