Improving text categorization with semantic knowledge in Wikipedia

X Wang, Y Jia, R Chen, H Fan… - IEICE TRANSACTIONS on …, 2013 - search.ieice.org
Text categorization, especially short text categorization, is a difficult and challenging task
since the text data is sparse and multidimensional. In traditional text classification methods …

Building semantic kernels for text classification using wikipedia

P Wang, C Domeniconi - Proceedings of the 14th ACM SIGKDD …, 2008 - dl.acm.org
Document classification presents difficult challenges due to the sparsity and the high
dimensionality of text data, and to the complex semantics of the natural language. The …

Using Wikipedia knowledge to improve text classification

P Wang, J Hu, HJ Zeng, Z Chen - Knowledge and Information Systems, 2009 - Springer
Text classification has been widely used to assist users with the discovery of useful
information from the Internet. However, traditional classification methods are based on the …

Exploiting term relationship to boost text classification

D Shen, J Wu, B Cao, JT Sun, Q Yang… - Proceedings of the 18th …, 2009 - dl.acm.org
Document classification provides an effective way to handle the explosive online textual
data. However, in practical classification settings, we face the so-called feature sparsity …

Exploiting Turkish Wikipedia as a semantic resource for text classification

M Poyraz, MC Ganiz, S Akyokuş… - … on Innovations in …, 2012 - ieeexplore.ieee.org
The majority of existing text classification algorithms are based on the “bag of words” (BOW)
approach, in which the documents are represented as weighted occurrence frequencies of …

Wikipedia based short text classification method

J Li, Y Cai, Z Cai, H Leung, K Yang - … , Suzhou, China, March 27-30, 2017 …, 2017 - Springer
Short text is usually concise and carries insufficient information, which makes text
classification difficult. But we can try to introduce some information from the existing …

Short text classification using wikipedia concept based document representation

X Wang, R Chen, Y Jia, B Zhou - … International Conference on …, 2013 - ieeexplore.ieee.org
Short text classification is a difficult and challenging task in information retrieval systems
since the text data is short, sparse and multidimensional. In this paper, we represent short …

Fast text categorization using concise semantic analysis

Z Li, Z Xiong, Y Zhang, C Liu, K Li - Pattern Recognition Letters, 2011 - Elsevier
Text representation is a necessary procedure for text categorization tasks. Currently, bag of
words (BOW) is the most widely used text representation method but it suffers from two …

Automatic Classification of Documents from Wikipedia

L Xiangdong, R Tao, L Kang - Data Analysis and …, 2017 - manu44.magtech.com.cn
[Objective] This paper aims to improve the performance of text classification systems with the
help of Wikipedia's feature expansion function. [Methods] First, we established the CDF max …

Utilizing wikipedia knowledge in open directory project-based text classification

HY Shin, GJ Lee, WJ Ryu, SK Lee - Proceedings of the Symposium on …, 2017 - dl.acm.org
Traditional Open Directory Project (ODP)-based text classification methods use bag-of-
words approach, which only utilizes single words in ODP documents and ignores important …