Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage …
L Ayad, G Loukides, S Pissis - Proceedings of the VLDB Endowment …, 2023 - kclpure.kcl.ac.uk
In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data …
We consider finding a pattern of length $m$ in a compacted (linear-size) trie storing strings over an alphabet of size $\sigma$. In static tries, we achieve $O(m + \lg\lg\sigma)$ …
D Kempa, T Kociumaka - Proceedings of the 2023 Annual ACM-SIAM …, 2023 - SIAM
The suffix array, describing the lexicographical order of suffixes of a given text, and the suffix tree, a path-compressed trie of all suffixes, are the two most fundamental data structures for …
G Loukides, S Pissis - … 2021-29th Annual European Symposium on …, 2021 - inria.hal.science
The minimizers sampling mechanism is a popular mechanism for string sampling introduced independently by Schleimer et al. [SIGMOD 2003] and by Roberts et al. [Bioinf. 2004]. Given …
G Loukides, SP Pissis… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The minimizers sampling mechanism is a popular mechanism for string sampling. However, minimizers sampling mechanisms lack good guarantees on the expected size of their …
J Fischer - Information Processing Letters, 2010 - Elsevier
We prove that longest common prefix (LCP) information can be stored in much less space than previously known. More precisely, we show that in the presence of the text and the …
P Bille, IL Gørtz, FR Skjoldjensen - arXiv preprint arXiv:1612.01748, 2016 - arxiv.org
Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In the …
This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons, into a …