Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model.

D Isa, LH Lee, VP Kallimani, R Rajkumar - Comput. Inf. Sci., 2008 - core.ac.uk
This work utilizes the Bayes formula to vectorize a document according to a probability
distribution based on keywords reflecting the probable categories that the document may …

Text document preprocessing with the Bayes formula for classification using the support vector machine

D Isa, LH Lee, VP Kallimani… - IEEE Transactions on …, 2008 - ieeexplore.ieee.org
This work implements an enhanced hybrid classification method through the utilization of the
naïve Bayes classifier and the Support Vector Machine (SVM). In this project, the Bayes …

Multi-class document classification using support vector machine (SVM) based on improved Naïve bayes vectorization technique

HT Sueno, BD Gerardo, RP Medina - International Journal of …, 2020 - researchgate.net
At present several vectorization approaches are used to transform text documents into a
numerical format. A huge number of features converted from text data from a single …

Using the self organizing map for clustering of text documents

D Isa, VP Kallimani, LH Lee - Expert Systems with Applications, 2009 - Elsevier
An increasing number of computational and statistical approaches have been used for text
classification, including nearest-neighbor classification, naïve Bayes classification, support …

Hidden markov models for text categorization in multi-page documents

P Frasconi, G Soda, A Vullo - Journal of Intelligent Information Systems, 2002 - Springer
In the traditional setting, text categorization is formulated as a concept learning problem
where each instance is a single isolated document. However, this perspective is not …

Beyond vector space model for hierarchical Arabic text classification: A Markov chain approach

FS Al-Anzi, D AbuZeina - Information Processing & Management, 2018 - Elsevier
The vector space model (VSM) is a textual representation method that is widely used in
documents classification. However, it remains to be a space-challenging problem. One …

A probabilistic framework for the hierarchic organisation and classification of document collections

A Vinokourov, M Girolami - Journal of intelligent information systems, 2002 - Springer
This paper presents a probabilistic mixture modeling framework for the hierarchic
organisation of document collections. It is demonstrated that the probabilistic corpus model …

Document classification: an approach using feature clustering

BS Harish, B Udayasri - … Advances in Intelligent Informatics: Proceedings of …, 2014 - Springer
In this paper, we propose a new method of representing text documents based on feature
clustering approach. The proposed representation method is very powerful in reducing the …

Improving performance of text categorization by combining filtering and support vector machines

I Díaz, J Ranilla, E Montañes… - Journal of the …, 2004 - Wiley Online Library
Text Categorization is the process of assigning documents to a set of previously fixed
categories. A lot of research is going on with the goal of automating this time‐consuming …

Feature selection using support vector machines

J Brank, M Grobelnik, N Milic-Frayling… - WIT Transactions on …, 2002 - witpress.com
Text categorization is the task of classifying natural language documents into a set of
predefined categories. Documents are typically represented by sparse vectors under the …