[PDF] A Survey of Deep Web Data Integration (Deep Web 数据集成研究综述)

W Liu (刘伟), X Meng (孟小峰), W Meng (孟卫一) - Chinese Journal of Computers (计算机学报), 2007 - c.xml.org.cn
As the rapid development of World Wide Web, there is tremendous information "hiddened" in
Deep Web, and its capacity is increasing rapidly. The information can only be accessed by …

[BOOK][B] Web data mining: exploring hyperlinks, contents, and usage data

B Liu - 2011 - Springer
Liu has written a comprehensive text on Web mining, which consists of two parts. The first
part covers the data mining and machine learning foundations, where all the essential …

Data-Centric Systems and Applications

MJ Carey, S Ceri, P Bernstein, U Dayal, C Faloutsos… - Italy: Springer, 2006 - Springer
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …

[BOOK][B] XML in a nutshell: a desktop quick reference

ER Harold, WS Means - 2004 - books.google.com
If you're a developer working with XML, you know there's a lot to know about XML, and the
XML space is evolving almost moment by moment. But you don't need to commit every XML …

Automatic web news extraction using tree edit distance

DDC Reis, PB Golgher, AS Silva… - Proceedings of the 13th …, 2004 - dl.acm.org
The Web poses itself as the largest data repository ever available in the history of
humankind. Major efforts have been made in order to provide efficient access to relevant …

Structured data extraction from the web based on partial tree alignment

Y Zhai, B Liu - IEEE Transactions on Knowledge and Data …, 2006 - ieeexplore.ieee.org
This paper studies the problem of structured data extraction from arbitrary Web pages. The
objective of the proposed research is to automatically segment data records in a page …

Automatic information extraction from large websites

V Crescenzi, G Mecca - Journal of the ACM (JACM), 2004 - dl.acm.org
Information extraction from websites is nowadays a relevant problem, usually performed by
software modules called wrappers. A key requirement is that the wrapper generation …

Using the structure of web sites for automatic segmentation of tables

K Lerman, L Getoor, S Minton, C Knoblock - Proceedings of the 2004 …, 2004 - dl.acm.org
Many Web sites, especially those that dynamically generate HTML pages to display the
results of a user's query, present information in the form of list or tables. Current tools that …

An information foraging theory perspective on tools for debugging, refactoring, and reuse tasks

SD Fleming, C Scaffidi, D Piorkowski… - ACM Transactions on …, 2013 - dl.acm.org
Theories of human behavior are an important but largely untapped resource for software
engineering research. They facilitate understanding of human developers' needs and …

Annotating search results from web databases

Y Lu, H He, H Zhao, W Meng… - IEEE transactions on …, 2011 - ieeexplore.ieee.org
An increasing number of databases have become web accessible through HTML form-based
search interfaces. The data units returned from the underlying database are usually …