Efficient web-based data imputation with graph model

Y Tang, H Wang, S Zhang, H Zhang, R Shi - Database Systems for …, 2017 - Springer
Database Systems for Advanced Applications: DASFAA 2017 International …, 2017, Springer
Abstract
A challenge for data imputation is the lack of knowledge. In this paper, we attempt to address this challenge by bringing in extra knowledge from the web. To achieve high-performance web-based imputation, we use dependencies, i.e., functional dependencies (FDs) and conditional functional dependencies (CFDs), to impute as many values as possible automatically, and we fill in the remaining missing values with minimal access to the web, whose cost is relatively high. To make full use of the dependencies, we model the dependency set on the data as a graph and perform automatic imputation and keyword generation for web-based imputation based on this graph model. With the generated keywords, we design two algorithms to extract values for imputation from the search results. Extensive experimental results on real-world data collections show that the proposed approach can impute missing values efficiently and effectively compared to an existing approach.
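The dependency-first idea in the abstract can be illustrated with a minimal sketch. It is not the paper's graph model or its algorithms: it assumes plain FDs only (no CFDs), represents each record as a dictionary, and uses hypothetical names (`impute_with_fds`, `Row`, `FD`) and made-up sample rows. It fills every value that the FDs determine from other records and returns the cells that would still need a web lookup.

```python
from typing import Optional

# Assumed representation: a row maps attribute names to values (None = missing).
Row = dict[str, Optional[str]]
# Assumed representation of an FD: (left-hand attributes, right-hand attribute).
FD = tuple[tuple[str, ...], str]


def impute_with_fds(rows: list[Row], fds: list[FD]) -> list[tuple[int, str]]:
    """Fill missing values in place using FDs; return the cells still missing."""
    changed = True
    while changed:  # repeat, because a filled value may make another FD applicable
        changed = False
        for lhs, rhs in fds:
            # Collect a known right-hand value for each complete left-hand combination.
            known: dict[tuple, str] = {}
            for row in rows:
                if row[rhs] is not None and all(row[a] is not None for a in lhs):
                    known.setdefault(tuple(row[a] for a in lhs), row[rhs])
            # Copy the known value into rows that agree on the left-hand side.
            for row in rows:
                if row[rhs] is None and all(row[a] is not None for a in lhs):
                    key = tuple(row[a] for a in lhs)
                    if key in known:
                        row[rhs] = known[key]
                        changed = True
    # Whatever is still missing would be left for the (costly) web-based step.
    return [(i, attr) for i, row in enumerate(rows)
            for attr, value in row.items() if value is None]


# Illustrative data only, not taken from the paper.
rows = [
    {"zip": "10001", "city": "New York", "state": "NY"},
    {"zip": "10001", "city": None, "state": None},
    {"zip": "94105", "city": None, "state": "CA"},
]
fds = [(("zip",), "city"), (("city",), "state")]
remaining = impute_with_fds(rows, fds)
```

In this sketch the second row is completed from the first by chaining the two FDs, while the third row's city has no matching record and is reported in `remaining`, which corresponds to the values the abstract says are left for web access.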