Machine learning in automated text categorization

F Sebastiani - ACM computing surveys (CSUR), 2002 - dl.acm.org
The automated categorization (or classification) of texts into predefined categories has
witnessed a booming interest in the last 10 years, due to the increased availability of …

Web page classification: Features and algorithms

X Qi, BD Davison - ACM computing surveys (CSUR), 2009 - dl.acm.org
Classification of Web page content is essential to many tasks in Web information retrieval
such as maintaining Web directories and focused crawling. The uncontrolled nature of Web …

[BOOK][B] The text mining handbook: advanced approaches in analyzing unstructured data

R Feldman, J Sanger - 2007 - books.google.com
Text mining is a new and exciting area of computer science research that tries to solve the
crisis of information overload by combining techniques from data mining, machine learning …

Computational intelligence and feature selection: rough and fuzzy approaches

R Jensen, Q Shen - 2008 - books.google.com
The rough and fuzzy set approaches presented here open up many new frontiers for
continued research and development. Computational Intelligence and Feature Selection …

Using web structure for classifying and describing web pages

EJ Glover, K Tsioutsiouliklis, S Lawrence… - Proceedings of the 11th …, 2002 - dl.acm.org
The structure of the web is increasingly being used to improve organization, search, and
analysis of information on the web. For example, Google uses the text in citing documents …

Web-page classification through summarization

D Shen, Z Chen, Q Yang, HJ Zeng, B Zhang… - Proceedings of the 27th …, 2004 - dl.acm.org
Web-page classification is much more difficult than pure-text classification due to a large
variety of noisy information embedded in Web pages. In this paper, we propose a new Web …

[PDF][PDF] A tutorial on automated text categorisation

F Sebastiani - Proceedings of ASAI-99, 1st Argentinian …, 1999 - backup.blackwinter.de
The automated categorisation (or classification) of texts into topical categories has a long
history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem …

Verifying relevance between keywords and web site contents

B Zhang, HJ Zeng, Z Chen, WY Ma, L Li, Y Li… - US Patent …, 2007 - Google Patents
Systems and methods for verifying relevance between terms and Web site contents
are described. In one aspect, site contents from a bid URL are retrieved. Expanded term(s) …

[PDF][PDF] Combining rough and fuzzy sets for feature selection

R Jensen - 2005 - academia.edu
Feature selection (FS) refers to the problem of selecting those input attributes that are most
predictive of a given outcome; a problem encountered in many areas such as machine …

Web mining–concepts, applications and research directions

T Srivastava, P Desikan, V Kumar - Foundations and advances in data …, 2005 - Springer
From its very beginning, the potential of extracting valuable knowledge from the Web has
been quite evident. Web mining, i.e., the application of data mining techniques to extract …