Fast dictionary-based compression for inverted indexes

GE Pibiri, M Petri, A Moffat - … of the twelfth ACM international conference …, 2019 - dl.acm.org
Dictionary-based compression schemes provide fast decoding operation, typically at the
expense of reduced compression effectiveness compared to statistical or probability-based …

Large-alphabet semi-static entropy coding via asymmetric numeral systems

A Moffat, M Petri - ACM Transactions on Information Systems (TOIS), 2020 - dl.acm.org
An entropy coder takes as input a sequence of symbol identifiers over some specified
alphabet and represents that sequence as a bitstring using as few bits as possible, typically …

Individually optimal single- and multiple-tree almost instantaneous variable-to-fixed codes

D Dubé, F Haddad - 2018 IEEE International Symposium on …, 2018 - ieeexplore.ieee.org
Variable-to-fixed (VF) codes are often based on dictionaries that obey the prefix-free
property; e.g., the Tunstall codes. However, correct VF codes need not be prefix free …

Improving Marlin's compression ratio with partially overlapping codewords

M Martinez, K Sandfort, D Dubé… - 2018 Data …, 2018 - ieeexplore.ieee.org
Marlin [1] is a Variable-to-Fixed (VF) codec optimized for decoding speed. To achieve its
speed, Marlin does not encode the current state of the input source, penalizing compression …

High-throughput variable-to-fixed entropy codec using selective, stochastic code forests

MM Torres, M Hernandez-Cabronero, I Blanes… - IEEE …, 2020 - ieeexplore.ieee.org
Efficient high-throughput (HT) compression algorithms are paramount to meet the stringent
constraints of present and upcoming data storage, processing, and transmission systems. In …

Space and Time-Efficient Data Structures for Massive Datasets

GE Pibiri - 2019 - tesidottorato.depositolegale.it
This thesis concerns the design of compressed data structures for the efficient storage of
massive datasets of integer sequences and short strings. The studied problems arise in …

MIHBS: A mobile interface of high bandwidth for wireless sensor networks

L Sun, L Wang, J Fang, J Liu, C Ma - IEEE Access, 2018 - ieeexplore.ieee.org
In modern days, wireless sensor networks and smart phones have been widely used in
various application domains including healthcare, environment, and intelligent building …

Compression and Pattern Matching

T Kida, I Furuya - … Paradigm: Algorithmic Revolution in the Big …, 2022 - library.oapen.org
We introduce our research on compressed pattern matching technology that combines data
compression and pattern matching. To show the results of this work, we explain the collage …

Efficient Codebook Constructions of AIVF Codes

Y You, SJ Lin - 2021 IEEE International Symposium on …, 2021 - ieeexplore.ieee.org
The almost instantaneous variable-to-fixed (AIVF) code is a class of non-prefix codes that
parses data sequences into some fixed-length codewords. Recently, the dictionary …

Rice-Marlin Codes: Tiny and Efficient Variable-to-Fixed Codes

M Martinez, J Serra-Sagristà - arXiv preprint arXiv:1811.05756, 2018 - arxiv.org
Marlin is a Variable-to-Fixed (VF) codec optimized for high decoding speed through the use
of small sized dictionaries that fit in the L1 cache of most CPUs. While the size of Marlin …