Authors
Hadi Amiri, Hosein Hojjat, Farhad Oroumchian
Publication date
2007/2
Journal
Proceedings of the 12th International CSI Computer Conference (CSICC)
Description
One of the fundamental tasks in natural language processing is creating a suitable corpus for evaluating the effectiveness of different algorithms. In this paper, the authors report the creation of a test corpus for automatic part-of-speech (POS) tagging, based on Prof. Bijankhan's tagged Persian corpus. The study covers preprocessing, statistical analysis, and experiments with simple statistical POS tagging methods on this corpus. The experimental results show that even a simple POS tagging method such as Maximum Likelihood Estimation (MLE) reaches an acceptable accuracy of 93.16 percent.
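To make the MLE baseline concrete, here is a minimal sketch of a most-frequent-tag tagger. It is an illustration under stated assumptions, not the authors' implementation: the function names (`train_mle_tagger`, `tag`, `accuracy`), the fallback to the most frequent tag for unseen words, and the toy English data with made-up tags are all hypothetical and do not come from the paper or the Bijankhan corpus.

```python
from collections import Counter, defaultdict


def train_mle_tagger(tagged_sentences):
    """Build a word -> tag lexicon from (word, tag) pairs.

    For each word w the chosen tag is argmax_t count(w, t) / count(w),
    which is the maximum likelihood estimate of P(t | w).
    """
    counts = defaultdict(Counter)
    tag_totals = Counter()
    for sentence in tagged_sentences:
        for word, tag_label in sentence:
            counts[word][tag_label] += 1
            tag_totals[tag_label] += 1
    lexicon = {w: tags.most_common(1)[0][0] for w, tags in counts.items()}
    # Assumption: unseen words get the overall most frequent tag.
    default_tag = tag_totals.most_common(1)[0][0]
    return lexicon, default_tag


def tag(words, lexicon, default_tag):
    """Tag each word independently with its most likely tag."""
    return [(w, lexicon.get(w, default_tag)) for w in words]


def accuracy(tagged_sentences, lexicon, default_tag):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sentence in tagged_sentences:
        predicted = tag([w for w, _ in sentence], lexicon, default_tag)
        for (_, gold), (_, guess) in zip(sentence, predicted):
            correct += gold == guess
            total += 1
    return correct / total if total else 0.0


# Toy data with hypothetical tags, for illustration only.
train = [
    [("the", "DET"), ("dog", "N"), ("runs", "V")],
    [("the", "DET"), ("run", "N")],
    [("dogs", "N"), ("run", "V")],
    [("a", "DET"), ("run", "N")],
]
lexicon, default_tag = train_mle_tagger(train)
print(tag(["the", "run", "cat"], lexicon, default_tag))
```

Because each word is tagged without looking at its neighbours, an ambiguous word always receives its majority tag; the accuracy the abstract reports for MLE is measured against that kind of context-free baseline.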
Total citations
[Chart: citations per year, 2008 to 2024]
Scholar articles
H Amiri, H Hojjat, F Oroumchian - Proceedings of the 12th International CSI Computer …, 2007