Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems

Authors

Ingo Müller, Cornelius Ratsch, Franz Faerber

Publication date

2014/4

Conference

17th International Conference on Extending Database Technology (EDBT) – 2014

Pages

283–294

Description

Domain encoding is a common technique to compress the columns of a column store and to accelerate many types of queries at the same time. It is based on the assumption that most columns, in particular string columns, contain a relatively small set of distinct values. In this paper, we argue that domain encoding is not the end of the story. In real-world systems, we observe that a substantial number of the columns are of string types. Moreover, most of the memory space is consumed by only a small fraction of these columns. To address this issue, we make three main contributions. First, we survey several approaches and variants for dictionary compression, i.e., data structures that store the dictionary of domain encoding in a compressed way. As expected, there is a trade-off between the size of a data structure and its access performance. This observation can be used to compress rarely accessed data more than frequently accessed data. Furthermore, which approach has the best compression ratio for a certain column depends heavily on specific characteristics of its content. Consequently, as a second contribution, we present non-trivial sampling schemes for all our dictionary formats, enabling us to estimate their size for a given column. This way, it is possible to identify compression schemes specialized for the content of a specific column.

Third, we draft how to fully automate the decision of the dictionary format. We sketch a compression manager that selects the most appropriate dictionary format based on column access and update patterns, characteristics of the underlying data, and costs for set-up and access of the different data …
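
To make the ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows domain encoding of a string column, two illustrative dictionary formats (plain and front-coded), a block-based sampling estimate of dictionary size, and a toy format chooser. The byte layouts, the function names, the block sampling strategy, and the hot_threshold parameter are all assumptions made for illustration; the paper's actual formats, sampling schemes, and cost model are not described in this abstract.

```python
import random

def domain_encode(column):
    """Replace each string by an integer code into a sorted dictionary."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in column]

def plain_size(dictionary):
    """Assumed layout: UTF-8 payload plus one 4-byte offset per entry."""
    return sum(len(s.encode("utf-8")) + 4 for s in dictionary)

def shared_prefix_len(a, b):
    """Length of the common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def front_coded_size(dictionary):
    """Assumed layout: per entry, two 2-byte lengths plus the suffix that
    is not shared with the previous entry of the sorted dictionary."""
    total, previous = 0, b""
    for entry in dictionary:
        raw = entry.encode("utf-8")
        total += 4 + len(raw) - shared_prefix_len(previous, raw)
        previous = raw
    return total

def estimate_size(dictionary, size_fn, block=64, blocks=8, seed=0):
    """Estimate size_fn(dictionary) from a few contiguous blocks.
    Contiguous blocks keep sorted neighbours together, which matters for
    front coding; the first entry of each block is slightly overcharged."""
    if len(dictionary) <= block * blocks:
        return size_fn(dictionary)
    rng = random.Random(seed)
    starts = rng.sample(range(len(dictionary) - block + 1), blocks)
    sampled = sum(size_fn(dictionary[s:s + block]) for s in starts)
    return sampled * len(dictionary) / (block * blocks)

def choose_format(dictionary, accesses_per_second, hot_threshold=1000.0):
    """Toy rule: hot columns stay uncompressed, cold ones take the
    format with the smallest estimated size."""
    sizes = {
        "plain": estimate_size(dictionary, plain_size),
        "front_coded": estimate_size(dictionary, front_coded_size),
    }
    if accesses_per_second >= hot_threshold:
        return "plain", sizes
    return min(sizes, key=sizes.get), sizes

if __name__ == "__main__":
    column = ["customer_%05d" % (i % 2000) for i in range(10000)]
    dictionary, codes = domain_encode(column)
    print(len(dictionary), "distinct values for", len(codes), "rows")
    print(choose_format(dictionary, accesses_per_second=1.0))
```

In this sketch the trade-off from the abstract appears only as a single threshold on access frequency. A fuller model would also weigh update patterns and set-up costs, which the abstract names as inputs to the compression manager.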

Total citations

Cited by 77

Citations per year: 2013: 1, 2014: 2, 2015: 2, 2016: 6, 2017: 3, 2018: 6, 2019: 6, 2020: 13, 2021: 14, 2022: 11, 2023: 9, 2024: 4
