Unsupervised multilingual learning for POS tagging

B Snyder, T Naseem, J Eisenstein… - Proceedings of the 2008 …, 2008 - aclanthology.org
We demonstrate the effectiveness of multilingual learning for unsupervised part-of-speech
tagging. The key hypothesis of multilingual learning is that by combining cues from multiple …

Natural language processing for similar languages, varieties, and dialects: A survey

M Zampieri, P Nakov, Y Scherrer - Natural Language Engineering, 2020 - cambridge.org
There has been a lot of recent interest in the natural language processing (NLP) community
in the computational processing of language varieties and dialects, with the aim to improve …

Cross language POS taggers (and other tools) for Indian languages: An experiment with Kannada using Telugu resources

S Reddy, S Sharoff - … of the fifth international workshop on cross …, 2011 - aclanthology.org
Indian languages are known to have a large speaker base, yet some of these languages
have minimal or non-efficient linguistic resources. For example, Kannada is relatively …

Natural language processing for resource-poor languages

L Duong - University of Melbourne, 2017 - minerva-access.unimelb.edu.au
Natural language processing (NLP) aims, broadly speaking, to teach computers to
understand human language. This is hard as the computer must comprehend many facets of …

Designing and evaluating a Russian tagset

S Sharoff, M Kopotev, T Erjavec, A Feldman… - 2008 - digitalcommons.montclair.edu
This paper reports the principles behind designing a tagset to cover Russian
morphosyntactic phenomena, modifications of the core tagset, and its evaluation. The tagset …

Practical Natural Language Processing for Low-Resource Languages

BP King - 2015 - deepblue.lib.umich.edu
As the Internet and World Wide Web have continued to gain widespread adoption, the
linguistic diversity represented has also been growing. Simultaneously the field of …

Automatic language identification for Persian and Dari texts

S Malmasi, M Dras - Proceedings of PACLING, 2015 - science.mq.edu.au
We present the first empirical study of distinguishing Persian and Dari texts at the sentence
level, using discriminative models. As Dari is a low-resourced language, we developed a …

Simpler unsupervised POS tagging with bilingual projections

L Duong, P Cook, S Bird, P Pecina - … of the 51st Annual Meeting of …, 2013 - aclanthology.org
We present an unsupervised approach to part-of-speech tagging based on projections of
tags in a word-aligned bilingual parallel corpus. In contrast to the existing state-of-the-art …

Subdialectal differences in Sorani Kurdish

S Malmasi - Proceedings of the third workshop on NLP for similar …, 2016 - aclanthology.org
In this study we apply classification methods for detecting subdialectal differences in Sorani
Kurdish texts produced in different regions, namely Iran and Iraq. As Sorani is a low …

Adding more languages improves unsupervised multilingual part-of-speech tagging: A Bayesian non-parametric approach

B Snyder, T Naseem, J Eisenstein, R Barzilay - 2009 - dspace.mit.edu
We investigate the problem of unsupervised part-of-speech tagging when raw parallel data
is available in a large number of languages. Patterns of ambiguity vary greatly across …