Text Categorization: A comparison of classifiers, feature selection metrics and document representation

F Peleja, GP Lopes, J Silva - Proceedings of the 15th Portuguese …, 2011 - researchgate.net
In this paper, we compare several aspects related to automatic text categorization which
include document representation, feature selection, three classifiers, and their application to …

Feature selection strategies for text categorization

P Soucy, GW Mineau - Advances in Artificial Intelligence: 16th Conference …, 2003 - Springer
Feature selection is an important research issue in text categorization. The reason for this is
that thousands of features are often involved, even when the simplest document …

Experimenting N-Grams in Text Categorization

A Rahmoun, Z Elberrichi - Int. Arab J. Inf. Technol., 2007 - academia.edu
This paper deals with automatic supervised classification of documents. The approach
suggested is based on a vector representation of the documents centred not on the words …

A simple feature selection method for text classification

P Soucy, GW Mineau - Proceedings of the 17th international joint …, 2001 - dl.acm.org
In text classification most techniques use bag-of-words to represent documents. The main
problem is to identify what words are best suited to classify the documents in such a way as …

An evaluation of bag-of-concepts representations in automatic text classification

O Täckström - Recall, 2005 - Citeseer
Automatic text classification is the process of automatically classifying text documents into
pre-defined document classes. Traditionally, documents are represented in the so called …

Angular measures for feature selection in text categorization

EF Combarro, E Montanes, J Ranilla… - Proceedings of the 2006 …, 2006 - dl.acm.org
Text Categorization, which consists of automatically assigning documents to a set of
categories, usually involves the management of a huge number of features. Most of them are …

Best terms: an efficient feature-selection algorithm for text categorization

D Fragoudis, D Meretakis, S Likothanassis - Knowledge and Information …, 2005 - Springer
In this paper, we propose a new feature-selection algorithm for text classification, called best
terms (BT). The complexity of BT is linear with respect to the number of the training-set …

Using Laplace and angular measures for feature selection in text categorisation

E Montanes, P Alonso, EF Combarro… - International …, 2008 - inderscienceonline.com
Text Categorisation (TC) consists of automatically assigning documents to a set of prefixed
categories. It usually involves the management of a huge number of features. Some of them …

Discriminative features for text document classification

K Torkkola - Formal Pattern Analysis & Applications, 2004 - Springer
The bag-of-words approach to text document representation typically results in vectors of the
order of 5000–20,000 components as the representation of documents. To make effective …

Using typical testors for feature selection in text categorization

A Pons-Porrata, R Gil-García… - Progress in Pattern …, 2007 - Springer
A major difficulty of text categorization problems is the high dimensionality of the feature
space. Thus, feature selection is often performed in order to increase both the efficiency and …