Unicode-8 based linguistics data set of annotated Sindhi text

Authors

Mazhar Ali Dootio, Asim Imdad Wagan

Publication date

2018/8/1

Journal

Data in brief

Volume

Pages

1504-1514

Publisher

Elsevier

Description

The Sindhi Unicode-8 based linguistics data set is a multi-class, multi-featured data set. It was developed to address natural language processing (NLP) and linguistics problems for the Sindhi language. The data set presents information on the grammatical and morphological structure of Sindhi text, as well as the sentiment polarity of Sindhi lexicons. It may therefore be used for information retrieval, machine translation, lexicon analysis, language modeling, grammatical and morphological analysis, and semantic and sentiment analysis.

Total citations

Cited by 10

Citations per year: 2019: 2, 2020: 1, 2021: 2, 2022: 2, 2023: 2, 2024: 1

Scholar articles

Unicode-8 based linguistics data set of annotated Sindhi text

MA Dootio, AI Wagan - Data in brief, 2018

Cited by 10, Related articles, All 7 versions