There has been a lot of recent interest in the natural language processing (NLP) community in the computational processing of language varieties and dialects, with the aim to improve …
S Reddy, S Sharoff - … of the fifth international workshop on cross …, 2011 - aclanthology.org
Indian languages are known to have a large speaker base, yet some of these languages have minimal or inefficient linguistic resources. For example, Kannada is relatively …
L Duong - University of Melbourne, 2017 - minerva-access.unimelb.edu.au
Natural language processing (NLP) aims, broadly speaking, to teach computers to understand human language. This is hard as the computer must comprehend many facets of …
This paper reports the principles behind designing a tagset to cover Russian morphosyntactic phenomena, modifications of the core tagset, and its evaluation. The tagset …
As the Internet and World Wide Web have continued to gain widespread adoption, the linguistic diversity represented has also been growing. Simultaneously, the field of …
S Malmasi, M Dras - Proceedings of PACLING, 2015 - science.mq.edu.au
We present the first empirical study of distinguishing Persian and Dari texts at the sentence level, using discriminative models. As Dari is a low-resourced language, we developed a …
We present an unsupervised approach to part-of-speech tagging based on projections of tags in a word-aligned bilingual parallel corpus. In contrast to the existing state-of-the-art …
S Malmasi - Proceedings of the third workshop on NLP for similar …, 2016 - aclanthology.org
In this study we apply classification methods for detecting subdialectal differences in Sorani Kurdish texts produced in different regions, namely Iran and Iraq. As Sorani is a low …
We investigate the problem of unsupervised part-of-speech tagging when raw parallel data is available in a large number of languages. Patterns of ambiguity vary greatly across …