Privacy-preserving record linkage for big data: Current approaches and research challenges

D Vatsalan, Z Sehili, P Christen, E Rahm - Handbook of big data …, 2017 - Springer
The growth of Big Data, especially personal data dispersed in multiple data
sources, presents enormous opportunities and insights for businesses to explore and …

Administrative data linkage in Brazil: potentials for health technology assessment

MS Ali, MY Ichihara, LC Lopes, GCG Barbosa… - Frontiers in …, 2019 - frontiersin.org
Health technology assessment (HTA) is the systematic evaluation of the properties and
impacts of health technologies and interventions. In this article, we presented a discussion of …

[BOOK][B] The data matching process

P Christen - 2012 - Springer
This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …

Data-Centric Systems and Applications

MJ Carey, S Ceri, P Bernstein, U Dayal, C Faloutsos… - Italy: Springer, 2006 - Springer
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …

Linking sensitive data

P Christen, T Ranbaduge, R Schnell - Methods and techniques for …, 2020 - Springer
Sensitive personal data are created in many application domains, and there is now an
increasing demand to share, integrate, and link such data within and across organisations in …

A comparison of personal name matching: Techniques and practical issues

P Christen - Sixth IEEE International Conference on Data …, 2006 - ieeexplore.ieee.org
Finding and matching personal names is at the core of an increasing number of
applications: from text and Web mining, search engines, to information extraction …

Febrl - an open source data cleaning, deduplication and record linkage system with a graphical user interface

P Christen - Proceedings of the 14th ACM SIGKDD international …, 2008 - dl.acm.org
Matching records that refer to the same entity across databases is becoming an
increasingly important part of many data mining projects, as often data from multiple sources …

Quality and complexity measures for data linkage and deduplication

P Christen, K Goiser - Quality measures in data mining, 2007 - Springer
Deduplicating one data set or linking several data sets are increasingly important tasks in
the data preparation steps of many data mining projects. The aim of such linkages is to …

CIDACS-RL: a novel indexing search and scoring-based record linkage system for huge datasets with high accuracy and scalability

GCG Barbosa, MS Ali, B Araujo, S Reis, S Sena… - BMC medical informatics …, 2020 - Springer
Background: Record linkage is the process of identifying and combining records about the
same individual from two or more different datasets. While there are many open source and …

Multi-pass sorted neighborhood blocking with mapreduce

L Kolb, A Thor, E Rahm - Computer Science-Research and Development, 2012 - Springer
Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as
entity resolution on large datasets. We investigate challenges and possible solutions of …