A resource-light approach to morpho-syntactic tagging

T Erjavec - Language resources and evaluation, 2012 - Springer

The paper presents the MULTEXT-East language resources, a multilingual dataset for
language engineering research, focused on the morphosyntactic level of linguistic …

Cited by 137  Related articles  All 8 versions

[PDF] osf.io

[BOOK][B] Descriptive grammar of Bangla

AB David - 2015 - books.google.com

Bangla is spoken as the majority language in Bangladesh and the state of West Bengal in
India, and as a minority language in several other Indian states. With almost 200 million …

Cited by 35  Related articles  All 8 versions

[PDF] ox.ac.uk

[PDF] Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages.

P Baumann, JB Pierrehumbert - LREC, 2014 - phon.ox.ac.uk

The world-wide proliferation of digital communications has created the need for language
and speech processing systems for underresourced languages. Developing such systems is …

Cited by 26  Related articles  All 12 versions

[PDF] arxiv.org

Multext-east

T Erjavec - Handbook of linguistic annotation, 2017 - Springer

The chapter presents the MULTEXT-East language resources, a multilingual dataset for
language engineering research, focused on the morphosyntactic level of linguistic …

Cited by 17  Related articles  All 5 versions

[PDF] aclanthology.org

[PDF] A low-budget tagger for Old Czech

J Hana, A Feldman, K Aharodnik - … of the 5th ACL-HLT Workshop …, 2011 - aclanthology.org

The paper describes a tagger for Old Czech (1200-1500 AD), a fusional language with rich
morphology. The practical restrictions (no native speakers, limited corpora and lexicons …

Cited by 15  Related articles  All 16 versions

[PDF] aclanthology.org

[PDF] Generating learner-like morphological errors in Russian

M Dickinson - Proceedings of the 23rd International Conference …, 2010 - aclanthology.org

To speed up the process of categorizing learner errors and obtaining data for languages
which lack error-annotated data, we describe a linguistically-informed method for generating …

Cited by 16  Related articles  All 8 versions

[PDF] byu.edu

[BOOK][B] A probabilistic morphological analyzer for Syriac

PJ McClanahan - 2010 - search.proquest.com

We show that a carefully crafted probabilistic morphological analyzer significantly
outperforms a reasonable, naive baseline for Syriac. Syriac is an under-resourced Semitic …

Cited by 15  Related articles  All 11 versions

[PDF] academia.edu

[PDF] Building an old Occitan corpus via cross-Language transfer.

O Scrivner, S Kübler - KONVENS, 2012 - academia.edu

This paper describes the implementation of a resource-light approach, cross-language
transfer, to build and annotate a historical corpus for Old Occitan. Our approach transfers …

Cited by 9  Related articles  All 7 versions

[PDF] ufc.br

Utilização de informações lexicais extraídas automaticamente de corpora na análise sintática computacional do português

LFA Araripe - 2011 - repositorio.ufc.br

In developing deep syntactic parsers for unrestricted text, the main
difficulty to overcome is modeling the lexicon. Traditionally, two strategies have …

Cited by 10  Related articles  All 6 versions

[PDF] aelinco.es

POS-tagging a bilingual parallel corpus: methods and challenges

I Doval - Research in Corpus Linguistics, 2017 - ricl.aelinco.es

This paper reviews the author's experiences of tokenizing and POS tagging a bilingual
parallel corpus, the PaGeS Corpus, consisting mostly of German and Spanish fictional texts …

Cited by 5  Related articles  All 2 versions
