Opinion-mining on Marglish and Devanagari comments of YouTube cookery channels using parametric and non-parametric learning models

SR Shah, A Kaushik, S Sharma, J Shah - Big Data and Cognitive …, 2020 - mdpi.com
Big Data and Cognitive Computing, 2020 - mdpi.com
YouTube lets people educate, entertain, and express themselves on a wide range of topics. YouTube India currently has millions of active users, so the volume of data generated on the platform is large. India is a very diverse country and many of its people are multilingual, so opinions are often expressed in code-mixed form, that is, with two or more languages mixed in the same text. Because there is little research on Indian code-mixed language data, Sentiment Analysis (SA) of such text has become a necessity. In this paper, SA is carried out on Marglish (Marathi + English) and Devanagari Marathi comments extracted through the YouTube API from top Marathi channels. Several machine-learning models are applied to the dataset with three different vectorizing techniques. Multilayer Perceptron (MLP) with the Count vectorizer gives the best accuracy on the Marglish dataset, 62.68%, and Bernoulli Naïve Bayes with the Count vectorizer gives the best accuracy on the Devanagari dataset, 60.60%. MLP and Bernoulli Naïve Bayes are therefore considered the best-performing algorithms. 10-fold cross-validation and statistical testing were also carried out on the dataset to confirm the results.
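To make the pairing of a count vectorizer with Bernoulli Naïve Bayes concrete, the following is a minimal standard-library sketch, not the authors' implementation. It assumes whitespace tokenization, Laplace smoothing with `alpha=1.0`, and a handful of made-up romanized comments with hand-assigned labels; the names `fit_vocabulary`, `count_vector`, `BernoulliNB`, and `classify` are hypothetical and chosen only for this illustration.

```python
import math

def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()

def fit_vocabulary(docs):
    # Map every distinct training token to a column index.
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def count_vector(doc, vocab):
    # Count-vectorize one document; tokens outside the vocabulary are ignored.
    vec = [0] * len(vocab)
    for tok in tokenize(doc):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec

class BernoulliNB:
    """Bernoulli Naive Bayes: each count is binarized to present/absent."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.feature_prob = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Smoothed probability that feature j is present in class c.
            self.feature_prob[c] = [
                (sum(1 for r in rows if r[j] > 0) + self.alpha)
                / (len(rows) + 2 * self.alpha)
                for j in range(n_features)
            ]
        return self

    def predict_one(self, x):
        best, best_score = None, float("-inf")
        for c in self.classes:
            score = self.log_prior[c]
            for j, p in enumerate(self.feature_prob[c]):
                # Absent features also contribute, via log(1 - p).
                score += math.log(p) if x[j] > 0 else math.log(1.0 - p)
            if score > best_score:
                best, best_score = c, score
        return best

# Made-up toy comments (illustrative only, not data from the paper).
train_comments = [
    "recipe khup chan ahe",
    "mast recipe thank you",
    "khup chan video",
    "recipe bekar ahe",
    "video nahi avadla",
    "bekar video waste of time",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

vocab = fit_vocabulary(train_comments)
X_train = [count_vector(c, vocab) for c in train_comments]
model = BernoulliNB(alpha=1.0).fit(X_train, train_labels)

def classify(comment):
    return model.predict_one(count_vector(comment, vocab))

print(classify("khup chan recipe"))
print(classify("bekar recipe"))
```

Because the Bernoulli model only looks at whether a token occurs, the raw counts produced by the vectorizer are reduced to presence or absence before scoring. A k-fold evaluation such as the 10-fold cross-validation mentioned in the abstract would repeat the fit and predict steps on k train/test splits of a much larger labelled dataset.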