Authors
Anukarsh G Prasad, S Sanjana, Skanda M Bhat, BS Harish
Publication date
2017/10/21
Conference
2017 2nd International Conference on Knowledge Engineering and Applications (ICKEA)
Pages
1-5
Publisher
IEEE
Description
Social media has grown exponentially in recent years, and an immense amount of data is being released into the public domain through it. This large body of publicly available data can be used for research and a variety of applications. The objective of this paper is to counter problems with social media datasets, namely: the short-text nature (a limited quantity of text, 140 to 160 characters), the continuous streaming nature, the use of short forms and modern slang, and the increasing use of sarcasm in messages and posts. Sarcastic tweets can mislead data mining activities and result in wrong classification. This paper compares various classification algorithms, such as Random Forest, Gradient Boosting, Decision Tree, Adaptive Boost, Logistic Regression and Gaussian Naïve Bayes, to detect sarcasm in tweets from the Twitter Streaming API. The best classifier is chosen and paired with various pre …
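The classifier comparison named in the description can be illustrated with a minimal sketch. It assumes scikit-learn, bag-of-words features, and cross-validated accuracy as the selection criterion; none of these choices is confirmed by the description. The toy tweets, their labels, and the helper name `compare_classifiers` are hypothetical and exist only to make the example run.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data (1 = sarcastic, 0 = not sarcastic); not from the paper.
tweets = [
    "oh great another monday just what i needed",
    "i love waiting in line for hours so fun",
    "wow my phone died again what a surprise",
    "yeah because traffic jams are the best",
    "had a lovely dinner with my family tonight",
    "the new library opens on friday morning",
    "just finished my run and feeling good",
    "the weather is sunny and warm today",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def compare_classifiers(texts, targets, folds=2):
    """Return mean cross-validated accuracy for each candidate classifier."""
    # Bag-of-words counts are an assumed feature set. GaussianNB needs a
    # dense array, so the sparse matrix is converted once for all models.
    features = CountVectorizer().fit_transform(texts).toarray()
    y = np.asarray(targets)
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "Gradient Boosting": GradientBoostingClassifier(random_state=0),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "Adaptive Boost": AdaBoostClassifier(random_state=0),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Gaussian Naive Bayes": GaussianNB(),
    }
    return {
        name: cross_val_score(model, features, y, cv=folds).mean()
        for name, model in models.items()
    }


results = compare_classifiers(tweets, labels)
best = max(results, key=results.get)  # highest mean accuracy wins
for name, score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
print("Best classifier on the toy data:", best)
```

On eight invented tweets the scores carry no meaning; the sketch only shows the shape of a side-by-side comparison from which a best classifier could be picked.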
Total citations
Citations per year, 2018 to 2024 (bar chart; counts as extracted: 315142017164)
Scholar articles
AG Prasad, S Sanjana, SM Bhat, BS Harish - 2017 2nd International Conference on Knowledge …, 2017