Stagger: An open-source part of speech tagger for Swedish

R Östling - Northern European Journal of Language Technology, 2013 - nejlt.ep.liu.se
This work presents Stagger, a new open-source part of speech tagger for Swedish based on
the Averaged Perceptron. By using the SALDO morphological lexicon and semi-supervised …

[PDF] The American National Corpus first release.

N Ide, K Suderman - LREC, 2004 - cs.vassar.edu
Abstract The First Release of the American National Corpus (ANC) was made available in
mid-fall, 2003. The data includes approximately 11 million words of American English …

Part-of-speech tagger for Assamese using ensembling approach

D Pathak, S Nandi, P Sarmah - ACM Transactions on Asian and Low …, 2023 - dl.acm.org
Ensemble system for part-of-speech (POS) tagging is beneficial for many resource-poor
languages that do not have enough annotated training data to train Deep Learning (DL, also …

Parallel bidirectionally pretrained taggers as feature generators

R Stanković, M Škorić, B Šandrih Todorović - Applied Sciences, 2022 - mdpi.com
In a setting where multiple automatic annotation approaches coexist and advance
separately but none completely solve a specific problem, the key might be in their …

[PDF] Automatic detection of English inclusions in mixed-lingual data with an application to parsing

B Alex - 2008 - researchgate.net
The influence of English continues to grow to the extent that its expressions have begun to
permeate the original forms of other languages. It has become more acceptable, and in …

[PDF] Developing a PoS-tagged corpus using existing tools

H Loftsson, JH Yngvason, S Helgadóttir… - … SaLTMiL Workshop on …, 2010 - academia.edu
In this paper, we describe the development of a new tagged corpus of Icelandic, consisting
of about 1 million tokens. The goal is to use the corpus, among other things, as a new gold …

[BOOK] A resource-light approach to morpho-syntactic tagging

A Feldman, J Hana - 2010 - books.google.com
While supervised corpus-based methods are highly accurate for different NLP tasks,
including morphological tagging, they are difficult to port to other languages because they …

Evaluation of different classifiers for Sinhala POS tagging

S Fernando, S Ranathunga - 2018 Moratuwa Engineering …, 2018 - ieeexplore.ieee.org
This paper presents a comparative evaluation of three state-of-the-art classifiers for Sinhala
Parts-of-Speech (POS) tagging. Support Vector Machines (SVM), Hidden Markov Models …

[PDF] Integrating Linguistic Resources: The American National Corpus Model.

N Ide, K Suderman - LREC, 2006 - lrec-conf.org
This paper describes the architecture of the American National Corpus and the design
decisions we have made in order to make the corpus easy to use with a variety of existing …

[PDF] CombiTagger: A system for developing combined taggers

V Henrich, T Reuter, H Loftsson - Twenty-Second International FLAIRS …, 2009 - cdn.aaai.org
The main task of part-of-speech (PoS) tagging is to assign the appropriate morphosyntactic
category to each word in a sentence. A combination of different PoS taggers usually results …