An overview of end-to-end entity resolution for big data

V Christophides, V Efthymiou, T Palpanas… - ACM Computing …, 2020 - dl.acm.org
One of the most critical tasks for improving data quality and increasing the reliability of data
analytics is Entity Resolution (ER), which aims to identify different descriptions that refer to …

Blocking and filtering techniques for entity resolution: A survey

G Papadakis, D Skoutas, E Thanos… - ACM Computing Surveys …, 2020 - dl.acm.org
Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that
correspond to the same real-world object. Due to its inherently quadratic complexity, a series …

Data and information quality

C Batini, M Scannapieco - Cham, Switzerland: Springer International …, 2016 - Springer
This book is the result of a study path that started in 2006, when the two authors of this book
published the book Data Quality: Concepts, Methodologies and Techniques. After 8 years …

[BOOK][B] The data matching process

P Christen - 2012 - Springer
This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …

A survey of indexing techniques for scalable record linkage and deduplication

P Christen - IEEE transactions on knowledge and data …, 2011 - ieeexplore.ieee.org
Record linkage is the process of matching records from several databases that refer to the
same entities. When applied on a single database, this process is known as deduplication …

Data-Centric Systems and Applications

MJ Carey, S Ceri, P Bernstein, U Dayal, C Faloutsos… - Italy: Springer, 2006 - Springer
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …

A survey on blocking technology of entity resolution

BH Li, Y Liu, AM Zhang, WH Wang, S Wan - Journal of Computer Science …, 2020 - Springer
Entity resolution (ER) is a significant task in data integration, which aims to detect all entity
profiles that correspond to the same real-world entity. Due to its inherently quadratic …

A taxonomy of privacy-preserving record linkage techniques

D Vatsalan, P Christen, VS Verykios - Information Systems, 2013 - Elsevier
The process of identifying which records in two or more databases correspond to the same
entity is an important aspect of data quality activities such as data pre-processing and data …

[BOOK][B] The four generations of entity resolution

Entity Resolution (ER) lies at the core of data integration and cleaning and, thus, a bulk of
the research examines ways for improving its effectiveness and time efficiency. The initial …

Comparative analysis of approximate blocking techniques for entity resolution

G Papadakis, J Svirsky, A Gal, T Palpanas - Proceedings of the VLDB …, 2016 - dl.acm.org
Entity Resolution is a core task for merging data collections. Due to its quadratic complexity,
it typically scales to large volumes of data through blocking: similar entities are clustered into …