Ontology population and enrichment: State of the art

G Petasis, V Karkaletsis, G Paliouras, A Krithara… - … extraction and ontology …, 2011 - Springer
Ontology learning is the process of acquiring (constructing or integrating) an ontology
(semi-)automatically. Being a knowledge acquisition task, it is a complex activity, which becomes …

Duplicate record detection: A survey

AK Elmagarmid, PG Ipeirotis… - IEEE Transactions on …, 2006 - ieeexplore.ieee.org
Often, in the real world, entities have two or more representations in databases. Duplicate
records do not share a common key and/or they contain errors that make duplicate matching …

Real-world data is dirty: Data cleansing and the merge/purge problem

MA Hernández, SJ Stolfo - Data mining and knowledge discovery, 1998 - Springer
The problem of merging multiple databases of information about common entities is
frequently encountered in KDD and decision support applications in large commercial and …

The merge/purge problem for large databases

MA Hernández, SJ Stolfo - ACM Sigmod Record, 1995 - dl.acm.org
Many commercial organizations routinely gather large numbers of databases for various
marketing and business analysis functions. The task is to correlate information from different …

Overview and framework for data and information quality research

SE Madnick, RY Wang, YW Lee, H Zhu - Journal of data and information …, 2009 - dl.acm.org
Awareness of data and information quality issues has grown rapidly in light of the critical role
played by the quality of information in our data-intensive, knowledge-based economy …

Method and apparatus for imaging, image processing and data compression

SJ Stolfo - US Patent 5,748,780, 1998 - Google Patents
A method for processing an image, consisting of a foreground and a
background, to produce a highly compressed and accurate representation of the image …

Learning object identification rules for information integration

S Tejada, CA Knoblock, S Minton - Information Systems, 2001 - Elsevier
When integrating information from multiple websites, the same data objects can exist in
inconsistent text formats across sites, making it difficult to identify matching objects using …

[CITATION][C] Data quality

RY Wang - 2001 - books.google.com
Data Quality provides an exposé of research and practice in the data quality field for
technically oriented readers. It is based on the research conducted at the MIT Total Data …

[PDF][PDF] Record linkage: Current practice and future directions

L Gu, R Baxter, D Vickers, C Rainsford - CSIRO Mathematical and …, 2003 - Citeseer
Record linkage is the task of quickly and accurately identifying records corresponding to the
same entity from one or more data sources. Record linkage is also known as data cleaning …

[BOOK][B] Fundamentals of relational database management systems

S Sumathi, S Esakkirajan - 2007 - books.google.com
Information is a valuable resource to an organization. Computer software provides an
efficient means of processing information, and database systems are becoming an …