Sentiment classification of the Slovenian news texts

J Bučar, J Povh, M Žnidaršič - … of the 9th International Conference on …, 2016 - Springer
Proceedings of the 9th International Conference on Computer Recognition …, 2016, Springer
Abstract
This paper addresses automatic two-class, document-level sentiment classification. We retrieved textual documents with political, business, economic and financial content from five Slovenian web media outlets. By annotating a sample of 10,427 documents, we obtained a labelled corpus in the Slovenian language. Five classifiers were evaluated on this corpus: multinomial naïve Bayes, support vector machines, random forest, k-nearest neighbour and naïve Bayes; the first three were also used to assess the pre-processing options. Among the selected classifiers, multinomial naïve Bayes outperforms the naïve Bayes, k-nearest neighbour, random forest and support vector machine classifiers in terms of classification accuracy. With the best selection of pre-processing options, multinomial naïve Bayes achieves more than 95% classification accuracy, and the support vector machine and random forest classifiers achieve more than 85%.
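
The abstract reports that multinomial naïve Bayes gave the highest accuracy. As a rough illustration of how such a classifier scores a document, here is a minimal pure-Python sketch, assuming bag-of-words counts with Laplace smoothing. The class name `MultinomialNB`, the `tokenize` helper, the `alpha` parameter and the toy English sentences are all illustrative assumptions; they are not the authors' implementation, corpus or pre-processing, none of which the abstract specifies.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical stand-in for real pre-processing: lowercase + whitespace split.
    return text.lower().split()


class MultinomialNB:
    """Minimal multinomial naive Bayes over word counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        # Per-class word frequency tables.
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        # Vocabulary is the union of all words seen in training.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        # Log prior: fraction of training documents in each class.
        class_docs = Counter(labels)
        n = len(labels)
        self.log_prior = {c: math.log(class_docs[c] / n) for c in self.classes}
        # Total number of word tokens per class, used in the likelihood.
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        v = len(self.vocab)
        best, best_score = None, -math.inf
        for c in self.classes:
            score = self.log_prior[c]
            denom = self.total[c] + self.alpha * v
            for tok in tokenize(doc):
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[c][tok] + self.alpha) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


# Invented toy sentences, NOT taken from the Slovenian corpus.
train_docs = [
    "profits rise as exports grow",
    "strong growth lifts markets",
    "company reports record profits",
    "losses deepen as exports fall",
    "weak demand hits markets",
    "company reports heavy losses",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

model = MultinomialNB().fit(train_docs, train_labels)
```

Calling `model.predict(text)` returns the class with the highest smoothed log-probability. On a real corpus, accuracy would be estimated on held-out documents rather than on a handful of sentences like these.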