Information extraction for search engines using fast heuristic techniques

HA Sleiman, R Corchuelo - IEEE Transactions on Knowledge …, 2012 - ieeexplore.ieee.org

Extracting information from web documents has become a research area in which new
proposals sprout out year after year. This has motivated several researchers to work on …

Cited by 140 · Related articles · All 10 versions

Trinity: on using trinary trees for unsupervised web data extraction

HA Sleiman, R Corchuelo - IEEE Transactions on Knowledge …, 2013 - ieeexplore.ieee.org

Web data extractors are used to extract data from web documents in order to feed automated
processes. In this article, we propose a technique that works on two or more web documents …

Cited by 95 · Related articles · All 6 versions

Box clustering segmentation: A new method for vision-based web page preprocessing

J Zeleny, R Burget, J Zendulka - Information Processing & Management, 2017 - Elsevier

This paper presents a novel approach to web page segmentation, which is one of
substantial preprocessing steps when mining data from web documents. Most of the current …

Cited by 50 · Related articles · All 3 versions

Learning to extract local events from the web

J Foley, M Bendersky, V Josifovski - … of the 38th International ACM SIGIR …, 2015 - dl.acm.org

The goal of this work is extraction and retrieval of local events from web pages. Examples of
local events include small venue concerts, theater performances, garage sales, movie …

Cited by 62 · Related articles · All 12 versions

Matching parse thickets for open domain question answering

B Galitsky - Data & Knowledge Engineering, 2017 - Elsevier

Traditional parse trees are combined together and enriched with anaphora and rhetoric
information to form a unified representation for a paragraph of text. We refer to these …

Cited by 50 · Related articles · All 3 versions

A methodology to learn ontological attributes from the Web

D Sánchez - Data & Knowledge Engineering, 2010 - Elsevier

Class descriptors such as attributes, features or meronyms are rarely considered when
developing ontologies. Even WordNet only includes a reduced amount of part-of …

Cited by 97 · Related articles · All 5 versions

Tex: An efficient and effective unsupervised web information extractor

HA Sleiman, R Corchuelo - Knowledge-Based Systems, 2013 - Elsevier

The World Wide Web is an immense information resource. Web information extraction is the
task that transforms human friendly Web information into structured information that can be …

Cited by 69 · Related articles · All 5 versions

Linear combination of component results in information retrieval

S Wu - Data & Knowledge Engineering, 2012 - Elsevier

In information retrieval, data fusion (also known as meta-search) has been investigated by
many researchers. Previous investigation and experimentation demonstrate that the linear …

Cited by 47 · Related articles · All 5 versions

AutoRM: An effective approach for automatic Web data record mining

S Shi, C Liu, Y Shen, C Yuan, Y Huang - Knowledge-Based Systems, 2015 - Elsevier

A Web database typically responds to a query with a Web page, which encodes the query
results into semi-structured data objects using HTML tags. We call such data objects Web …

Cited by 36 · Related articles · All 4 versions

Query recommendation for improving search engine results

HM Zahera, GF El-Hady… - … Retrieval Methods for …, 2013 - igi-global.com

As web contents grow, the importance of search engines become more critical and at the
same time user satisfaction decreases. Query recommendation is a new approach to …

Cited by 51 · Related articles · All 17 versions
