Unsupervised feature generation using knowledge repositories for effective text categorization

R Prasath, S Sarkar - ECAI 2010, 2010 - ebooks.iospress.nl
We propose an unsupervised feature generation algorithm using the repositories of human
knowledge for effective text categorization. Conventional bag of words (BOW) depends on …

Contributions to automatic knowledge extraction from unstructured data

ID Morariu, NV Lucian - 2007 - webspace.ulbsibiu.ro
There are an increasing number of online documents and an automated document
classification is an important challenge. It is essential to be able to automatically organize …

Using Wikipedia knowledge to improve text classification

P Wang, J Hu, HJ Zeng, Z Chen - Knowledge and Information Systems, 2009 - Springer
Text classification has been widely used to assist users with the discovery of useful
information from the Internet. However, traditional classification methods are based on the …

Text classification using web corpora and EM algorithms

CM Hung, LF Chien - Asia Information Retrieval Symposium, 2004 - Springer
The insufficiency and irrelevancy of training corpora are always the main problems to overcome
while doing text classification. This paper proposes a Web-based text classification …

Domain concept handling in automated text categorization

Y Liu, HT Loh - 2010 5th IEEE Conference on Industrial …, 2010 - ieeexplore.ieee.org
Single-term-based document representations, e.g., bag-of-words, have been widely accepted
in the machine learning, information retrieval and text mining community. One notable …

KACTL: knowware based automated construction of a treelike library from web documents

R Lu, Y Huang, K Sun, Z Chen, Y Chen… - … Conference on Web …, 2012 - Springer
This paper proposed a knowware based supervised machine learning technique for domain
specific regression and classification of Web documents. It is simple because it is only based …

A sample extension method based on Wikipedia and its application in text classification

W Zhu, Y Liu, G Hu, J Ni, Z Lu - Wireless Personal Communications, 2018 - Springer
Text classification is a topic in natural language processing that is particularly useful for
Internet information processing. Methods based on supervised learning require a large …

A feature selection for text categorization on research support system Papits

T Ozono, T Shintani, T Ito, T Hasegawa - … , New Zealand, August 9-13, 2004 …, 2004 - Springer
We have developed a research support system, called Papits, that shares research
information, such as PDF files of research papers, in computers on the network and …

Domain ontology guided feature-selection for document categorization

BB Wang, RB McKay, HA Abbass, M Barlow - Aust J Intell Inf Process Syst, 2001 - Citeseer
We present a novel method employing a hierarchical domain ontology structure to select
features representing documents. All raw words in the training documents are mapped to …

A new text representation scheme combining bag-of-words and bag-of-concepts approaches for automatic text classification

A Alahmadi, A Joorabchi… - 2013 7th IEEE GCC …, 2013 - ieeexplore.ieee.org
This paper introduces a new approach to creating text representations and applies it to
standard text classification collections. The approach is based on supplementing the well …