Authors
João Gama, Shazia Tabassum
Publication date
2021
Description
Topic modelling, or topic inference, is a well-known problem in text mining. It deals with the automatic categorisation of words or documents into groups of similar items, known as topics. On most social media platforms, such as Twitter, Instagram, and Facebook, hashtags are used to describe the content of posts. Modelling hashtags therefore helps in categorising posts as well as in analysing user preferences. In this work, we address this problem for hashtags that stream in real time. Our approach combines a graph of hashtags, dynamic sampling, and modularity-based community detection, applied to data from a popular social media engagement application. We further analyse the structure and quality of the topic clusters through empirical experiments. The results unveil latent semantic relations between hashtags and also show the frequent hashtags in each cluster. Moreover, in this approach, words in different languages are treated synonymously. We also observed the top trending topics and correlated clusters.
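
As a rough illustration of the three ingredients named in the abstract (a graph of hashtags, sampling over a stream, and modularity-based community detection), the following minimal Python sketch may help. It is not the authors' implementation. The class `HashtagGraph`, the functions `modularity` and `greedy_communities`, and the `window` parameter are hypothetical names introduced here. A fixed-size sliding window over recent posts stands in for the paper's dynamic sampling, and a naive greedy merge stands in for whatever modularity optimiser the authors used; both are assumptions.

```python
from collections import Counter, deque
from itertools import combinations


class HashtagGraph:
    """Weighted hashtag co-occurrence graph over the most recent posts.

    Assumption: a sliding window of `window` posts is used as a simple
    stand-in for the dynamic sampling mentioned in the abstract.
    """

    def __init__(self, window):
        self.window = window
        self.posts = deque()
        self.edges = Counter()  # (tag_a, tag_b) with tag_a < tag_b -> weight

    @staticmethod
    def _pairs(tags):
        # Every unordered pair of distinct hashtags in one post.
        return combinations(sorted(set(tags)), 2)

    def add_post(self, tags):
        self.posts.append(list(tags))
        for pair in self._pairs(tags):
            self.edges[pair] += 1
        if len(self.posts) > self.window:
            # Forget the oldest post so the graph tracks the stream.
            for pair in self._pairs(self.posts.popleft()):
                self.edges[pair] -= 1
                if self.edges[pair] == 0:
                    del self.edges[pair]


def modularity(edges, community_of):
    """Newman modularity Q of a partition of a weighted undirected graph."""
    m = sum(edges.values())
    if m == 0:
        return 0.0
    internal = Counter()  # edge weight inside each community
    degree = Counter()    # total weighted degree of each community
    for (a, b), w in edges.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def greedy_communities(edges):
    """Naive greedy agglomeration: repeatedly apply the merge that raises Q most."""
    nodes = sorted({n for pair in edges for n in pair})
    community_of = {n: n for n in nodes}  # start with singleton communities
    best_q = modularity(edges, community_of)
    while True:
        best_merge = None
        # Only communities joined by an edge can gain modularity by merging.
        candidates = {
            tuple(sorted((community_of[a], community_of[b])))
            for a, b in edges
            if community_of[a] != community_of[b]
        }
        for keep, drop in sorted(candidates):
            trial = {n: (keep if c == drop else c) for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:
            break
        community_of = best_merge
    groups = {}
    for n, c in community_of.items():
        groups.setdefault(c, set()).add(n)
    return list(groups.values()), best_q
```

Feeding posts one at a time through `add_post` and then calling `greedy_communities(graph.edges)` yields hashtag groups that can be read as candidate topics. The naive merge recomputes modularity for every candidate and is only practical for small graphs; a production system would more plausibly rely on an established optimiser such as Louvain.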