An empirical evaluation of columnar storage formats

X Zeng, Y Hui, J Shen, A Pavlo, W McKinney… - arXiv preprint arXiv …, 2023 - arxiv.org
Columnar storage is a core component of a modern data analytics system. Although many
database management systems (DBMSs) have proprietary storage formats, most provide …

The LSM design space and its read optimizations

S Sarkar, N Dayan… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
Log-structured merge (LSM) trees have emerged as one of the most commonly used
storage-based data structures in modern data systems as they offer high throughput for writes and …

REncoder: A space-time efficient range filter with local encoder

Z Wang, Z Zhong, J Guo, Y Wu, H Li… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
A range filter is a data structure to answer range membership queries. Range queries are
common in modern applications, and range filters have gained rising attention for improving …

InfiniFilter: Expanding filters to infinity and beyond

N Dayan, I Bercea, P Reviriego, R Pagh - … of the ACM on Management of …, 2023 - dl.acm.org
Filter data structures have been used ubiquitously since the 1970s to answer approximate
set-membership queries in various areas of computer science including architecture …

Oasis: An Optimal Disjoint Segmented Learned Range Filter

G Chen, Z He, M Li, S Luo - Proceedings of the VLDB Endowment, 2024 - dl.acm.org
The learning-enhanced data structure has inspired the development of the range filter,
bringing significantly better false positive rate (FPR) than traditional non-learned range …

QUIC-FL: Quick Unbiased Compression for Federated Learning

RB Basat, S Vargaftik, A Portnoy, G Einziger… - arXiv preprint arXiv …, 2022 - arxiv.org
Distributed Mean Estimation (DME), in which $n$ clients communicate vectors to a
parameter server that estimates their average, is a fundamental building block in …

Structural Designs Meet Optimality: Exploring Optimized LSM-tree Structures in A Colossal Configuration Space

J Liu, F Wang, D Mo, S Luo - Proceedings of the ACM on Management …, 2024 - dl.acm.org
Mainstream LSM-tree-based key-value stores face challenges in optimizing performance for
point lookup, range lookup, and update operations concurrently due to their constrained …

GRF: A Global Range Filter for LSM-Trees with Shape Encoding

H Wang, T Guo, J Yang, H Zhang - … of the ACM on Management of Data, 2024 - dl.acm.org
Log-structured merge-trees (LSM-trees) are widely used in key-value stores because of their
excellent write performance. To reduce LSM-tree's read amplification due to overlapping …

MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying

Z Wang, Z Shao - Proceedings of the ACM on Management of Data, 2023 - dl.acm.org
LSM-based key-value stores have been leveraged in many state-of-the-art data-intensive
applications as storage engines. As data volume scales up, a cost-efficient approach is to …

Aleph Filter: To Infinity in Constant Time

N Dayan, I Bercea, R Pagh - arXiv preprint arXiv:2404.04703, 2024 - arxiv.org
Filter data structures are widely used in various areas of computer science to answer
approximate set-membership queries. In many applications, the data grows dynamically …