Structured databases on the web: Observations and implications

KCC Chang, B He, C Li, M Patel, Z Zhang - ACM SIGMOD Record, 2004 - dl.acm.org
The Web has been rapidly "deepened" by the prevalence of databases online. With the
potentially unlimited information hidden behind their query interfaces, this "deep Web" of …

Building efficient and effective metasearch engines

W Meng, C Yu, KL Liu - ACM Computing Surveys (CSUR), 2002 - dl.acm.org
Frequently a user's information needs are stored in the databases of multiple search
engines. It is inconvenient and inefficient for an ordinary user to invoke multiple search …

Introduction to the special issue on the web as corpus

A Kilgarriff, G Grefenstette - Computational linguistics, 2003 - direct.mit.edu
Introduction to the Special Issue on the Web as Corpus. © 2003 Association for
Computational Linguistics. Adam …

A survey of Deep Web data integration

刘伟, 孟小峰, 孟卫一 - Chinese Journal of Computers (计算机学报), 2007 - c.xml.org.cn
With the rapid development of the World Wide Web, a tremendous amount of information is
"hidden" in the Deep Web, and its volume is increasing rapidly. The information can only be accessed by …

A survey of focused crawling techniques

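周立柱, 林玲 - Journal of Computer Applications (计算机应用), 2005 - joca.cn
The rapid growth of the Internet poses enormous challenges for finding and discovering information
on the World Wide Web. For the topic- or domain-specific queries that most users pose, traditional
general-purpose search engines often fail to return satisfactory result pages …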

Form-based ontology creation and information harvesting

DW Embley, C Tao, SW Liddle - US Patent 8,103,962, 2012 - Google Patents
Extracting data from web pages. User input is received defining a tabular form. User input is
received correlating portions of the form with user selected data items contained in one or …

System for automatically generating queries

GT Grefenstette, JG Shanahan - US Patent 6,778,979, 2004 - Google Patents
… vocabulary of information in the information retrieval system. The categorizer assigns
the selected document content a classification label from the organized classification of …

Document-centric system with auto-completion

JG Shanahan, GT Grefenstette - US Patent 6,820,075, 2004 - Google Patents
An information space is created using a document. Entities from the document and its
information space are used to create a database of entities. An auto-completion system uses …

Personalized web search for improving retrieval effectiveness

F Liu, C Yu, W Meng - IEEE Transactions on knowledge and …, 2004 - ieeexplore.ieee.org
Current Web search engines are built to serve all users, independent of the special needs of
any individual user. Personalization of Web search is to carry out retrieval for each user …

Evaluation measures for hierarchical classification: a unified view and novel approaches

A Kosmopoulos, I Partalas, E Gaussier… - Data Mining and …, 2015 - Springer
Hierarchical classification addresses the problem of classifying items into a hierarchy of
classes. An important issue in hierarchical classification is the evaluation of different …