The First Release of the American National Corpus (ANC) was made available in mid-fall, 2003. The data includes approximately 11 million words of American English …
An ensemble system for part-of-speech (POS) tagging is beneficial for many resource-poor languages that do not have enough annotated training data to train Deep Learning (DL, also …
R Stanković, M Škorić, B Šandrih Todorović - Applied Sciences, 2022 - mdpi.com
In a setting where multiple automatic annotation approaches coexist and advance separately but none completely solve a specific problem, the key might be in their …
The influence of English continues to grow to the extent that its expressions have begun to permeate the original forms of other languages. It has become more acceptable, and in …
In this paper, we describe the development of a new tagged corpus of Icelandic, consisting of about 1 million tokens. The goal is to use the corpus, among other things, as a new gold …
While supervised corpus-based methods are highly accurate for different NLP tasks, including morphological tagging, they are difficult to port to other languages because they …
This paper presents a comparative evaluation of three state-of-the-art classifiers for Sinhala Parts-of-Speech (POS) tagging. Support Vector Machines (SVM), Hidden Markov Models …
This paper describes the architecture of the American National Corpus and the design decisions we have made in order to make the corpus easy to use with a variety of existing …
V Henrich, T Reuter, H Loftsson - Twenty-Second International FLAIRS …, 2009 - cdn.aaai.org
The main task of part-of-speech (PoS) tagging is to assign the appropriate morphosyntactic category to each word in a sentence. A combination of different PoS taggers usually results …