P Náther - Comenius University, Bratislava, Slovakia, 2005 - Citeseer
We live in a world where information has great value, and the amount of available information (mostly on the internet) has grown rapidly in recent years. There are …
Most text categorization algorithms represent a document collection as a Bag of Words (BOW). The BOW representation is unable to recognize synonyms from a given term set and …
Traditionally, text classifiers are built from labeled training examples. Labeling is usually done manually by human experts (or the users), which is a labor-intensive and time …
Text classification is becoming increasingly important with the rapid growth of available online information. This paper describes the text classification process. Of course, a …
Automated text classification has been considered a vital method for managing and processing the vast number of documents in digital form that are widespread and continuously …
Most of the text categorization algorithms in the literature represent documents as collections of words. An alternative which has not been sufficiently explored is the use of word …
DD Lewis - Speech and Natural Language: Proceedings of a …, 1990 - aclanthology.org
The way in which text is represented has a strong impact on the performance of text classification (retrieval and categorization) systems. We discuss the operation of text …
The bag-of-words approach to text document representation typically results in vectors on the order of 5,000–20,000 components as the representation of documents. To make effective …