Authors
Teodora Sandra Buda, Thomas Cerqueus, John Murphy, Morten Kristiansen
Publication date
2013/8/14
Conference
2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)
Pages
153-160
Publisher
IEEE
Description
In a wide range of application areas (e.g., data mining, approximate query evaluation, histogram construction), database sampling has proved to be a powerful technique. It is generally used when the computational cost of processing large amounts of information is extremely high, and a faster response with a lower level of accuracy is preferred. Previous sampling techniques achieve this balance; however, the cost of the database sampling process itself should also be evaluated. We argue that the performance of current relational database sampling techniques that maintain the data integrity of the sample database is low and that a faster strategy needs to be devised. In this paper we propose a very fast sampling method that keeps the referential integrity of the sample database intact. The sampling method targets the production environment of a system under development, that generally …
Total citations
[Chart: citations per year, 2014 to 2021]
Scholar articles
TS Buda, T Cerqueus, J Murphy, M Kristiansen - 2013 IEEE 14th International Conference on …, 2013