This chapter provides an overview of the data matching process, and describes the five major steps involved in this process: data pre-processing (cleaning and standardisation) …
The rapid growth of the Web in the past two decades has made it the largest publicly accessible data source in the world. Web mining aims to discover useful information or …
Sensitive personal data are created in many application domains, and there is now an increasing demand to share, integrate, and link such data within and across organisations in …
Whether the goal is to estimate the number of people that live in a congressional district, to estimate the number of individuals that have died in an armed conflict, or to disambiguate …
L Gu, R Baxter, D Vickers, C Rainsford - CSIRO Mathematical and …, 2003 - Citeseer
Record linkage is the task of quickly and accurately identifying records corresponding to the same entity from one or more data sources. Record linkage is also known as data cleaning …
We discuss Bayesian approaches to multiple comparison problems, using a decision theoretic perspective to critically compare competing approaches. We set up decision …
This book brings together a collection of articles on statistical methods relating to missing data analysis, including multiple imputation, propensity scores, instrumental variables, and …
Data files and codes. Included in the supplementary material there are the following files: exampleA.dat, exampleB.dat and exampleV.dat contain the data used in Section 5. The …
M Sadinle - Journal of the American Statistical Association, 2017 - Taylor & Francis
The bipartite record linkage task consists of merging two disparate datafiles containing information on two overlapping sets of entities. This is nontrivial in the absence of unique …