Stingy sketch: a sketch framework for accurate and fast frequency estimation

H Li, Q Chen, Y Zhang, T Yang, B Cui - Proceedings of the VLDB …, 2022 - dl.acm.org
Recording the frequency of items in highly skewed data streams is a fundamental and hot
problem in recent years. The literature demonstrates that sketch is the most promising …

Double-Anonymous Sketch: Achieving Top-K-fairness for Finding Global Top-K Frequent Items

Y Zhao, W Han, Z Zhong, Y Zhang, T Yang… - Proceedings of the ACM …, 2023 - dl.acm.org
Finding top-K frequent items has been a hot topic in data stream processing in recent years,
which has a wide range of applications. However, most existing sketch algorithms focus …

Bitmatcher: Bit-level counter adjustment for sketches

Q Shi, C Jia, W Li, Z Liu, T Yang, J Ji… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
Sketch has been widely used in the field of large-scale data stream processing. However,
common fixed-counter algorithms such as Count-Min Sketch have to allocate larger …

MicroscopeSketch: Accurate Sliding Estimation Using Adaptive Zooming

Y Wu, S Jiang, S Dong, Z Zhong, J Chen, Y Hu… - Proceedings of the 29th …, 2023 - dl.acm.org
High-accuracy real-time data stream estimations are critical for various applications, and
sliding-window-based techniques have attracted wide attention. However, existing solutions …

Hypercalm sketch: One-pass mining periodic batches in data streams

Z Liu, C Kong, K Yang, T Yang, R Miao… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
Batch is an important pattern in data streams, which refers to a group of identical items that
arrive closely. We find that some special batches that arrive periodically are of great value. In …

Achieving Top-K-fairness for Finding Global Top-K Frequent Items

Y Zhao, W Zhou, W Han, Z Zhong… - … on Knowledge and …, 2024 - ieeexplore.ieee.org
Finding top-K frequent items has been a hot topic in data stream processing with wide-ranging
applications. However, most existing sketch algorithms focus on finding local top-K in a single …

A Unified Framework for Mining Batch and Periodic Batch in Data Streams

Z Liu, X Wang, Y Wu, T Yang, K Yang… - … on Knowledge and …, 2024 - ieeexplore.ieee.org
Batch is an important pattern in data streams, which refers to a group of identical items that
arrive closely. We find that some special batches that arrive periodically are of great value. In …

CodingSketch: A Hierarchical Sketch with Efficient Encoding and Recursive Decoding

Q Chen, Y Hong, Y Wu, T Yang… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
Sketch is a probabilistic data structure widely used in various fields due to its high accuracy
under small memory. Designing hierarchical data structures for real-world datasets with high …

A Compact and Accurate Sketch for Estimating a Large Range of Set Difference Cardinalities

P Jia, P Wang, R Li, J Zhao, J Feng… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
Computing set difference cardinalities is a critical task in database optimization, network
management, and anomaly detection. Due to the limited computational and memory …

Pb-Hash: Partitioned b-bit Hashing

P Li, W Zhao - Proceedings of the 2024 ACM SIGIR International …, 2024 - dl.acm.org
Many hashing algorithms including minwise hashing (MinHash), one permutation hashing
(OPH), and consistent weighted sampling (CWS) generate integers of B bits. With k hashes …