Elimination of redundant information for Web data mining

Authors

Shakirah Mohd Taib, Soon-Ja Yeom, Byeong-Ho Kang

Publication date

2005/4/4

Conference

International Conference on Information Technology: Coding and Computing (ITCC'05)-Volume II

Volume

Pages

200-205

Publisher

IEEE

Description

Billions of Web pages are now created with HTML or other markup languages. Compared to traditional text-based documents, they share few uniform structures and vary widely in authoring style. Users, however, usually focus on the particular section of a page that presents the information most relevant to their interest. Web document classification therefore needs to group and filter pages based on their content and relevant information. Many studies on Web mining report on mining Web structure and extracting information from Web content. However, they have focused on detecting tables that convey specific data, not tables that are used as a mechanism for structuring the layout of Web pages. Case modeling of tables can be constructed based on structure abstraction. Furthermore, Ripple Down Rules (RDR) is used to implement knowledge organization and construction, because it …

Total citations

Cited by 9

[Bar chart: citations per year, 2006 to 2013; nonzero yearly counts in order: 1, 1, 1, 2, 1, 2, 1]
