A survey on region extractors from web documents

HA Sleiman, R Corchuelo - IEEE Transactions on Knowledge …, 2012 - ieeexplore.ieee.org
Extracting information from web documents has become a research area in which new
proposals sprout out year after year. This has motivated several researchers to work on …

Trinity: on using trinary trees for unsupervised web data extraction

HA Sleiman, R Corchuelo - IEEE Transactions on Knowledge …, 2013 - ieeexplore.ieee.org
Web data extractors are used to extract data from web documents in order to feed automated
processes. In this article, we propose a technique that works on two or more web documents …

Box clustering segmentation: A new method for vision-based web page preprocessing

J Zeleny, R Burget, J Zendulka - Information Processing & Management, 2017 - Elsevier
This paper presents a novel approach to web page segmentation, which is one of the
substantial preprocessing steps when mining data from web documents. Most of the current …

Learning to extract local events from the web

J Foley, M Bendersky, V Josifovski - … of the 38th International ACM SIGIR …, 2015 - dl.acm.org
The goal of this work is extraction and retrieval of local events from web pages. Examples of
local events include small venue concerts, theater performances, garage sales, movie …

Matching parse thickets for open domain question answering

B Galitsky - Data & Knowledge Engineering, 2017 - Elsevier
Traditional parse trees are combined together and enriched with anaphora and rhetoric
information to form a unified representation for a paragraph of text. We refer to these …

A methodology to learn ontological attributes from the Web

D Sánchez - Data & Knowledge Engineering, 2010 - Elsevier
Class descriptors such as attributes, features or meronyms are rarely considered when
developing ontologies. Even WordNet only includes a reduced amount of part-of …

Tex: An efficient and effective unsupervised web information extractor

HA Sleiman, R Corchuelo - Knowledge-Based Systems, 2013 - Elsevier
The World Wide Web is an immense information resource. Web information extraction is the
task that transforms human friendly Web information into structured information that can be …

Linear combination of component results in information retrieval

S Wu - Data & Knowledge Engineering, 2012 - Elsevier
In information retrieval, data fusion (also known as meta-search) has been investigated by
many researchers. Previous investigation and experimentation demonstrate that the linear …

AutoRM: An effective approach for automatic Web data record mining

S Shi, C Liu, Y Shen, C Yuan, Y Huang - Knowledge-Based Systems, 2015 - Elsevier
A Web database typically responds to a query with a Web page, which encodes the query
results into semi-structured data objects using HTML tags. We call such data objects Web …

Query recommendation for improving search engine results

HM Zahera, GF El-Hady… - … Retrieval Methods for …, 2013 - igi-global.com
As web contents grow, the importance of search engines becomes more critical and at the
same time user satisfaction decreases. Query recommendation is a new approach to …