Compressed text indexes: From theory to practice

P Ferragina, R González, G Navarro… - Journal of Experimental …, 2009 - dl.acm.org
A compressed full-text self-index represents a text in a compressed form and still answers
queries efficiently. This represents a significant advancement over the (full-) text indexing …

Faster entropy-bounded compressed suffix trees

J Fischer, V Mäkinen, G Navarro - Theoretical Computer Science, 2009 - Elsevier
Suffix trees are among the most important data structures in stringology, with a number of
applications in flourishing areas like bioinformatics. Their main problem is space usage …

Text indexing for long patterns: Anchors are all you need

L Ayad, G Loukidis, S Pissis - Proceedings of the VLDB Endowment …, 2023 - kclpure.kcl.ac.uk
In many real-world database systems, a large fraction of the data is represented by strings:
sequences of letters over some alphabet. This is because strings can easily encode data …

Alphabet-dependent string searching with wexponential search trees

J Fischer, P Gawrychowski - … : 26th Annual Symposium, CPM 2015, Ischia …, 2015 - Springer
We consider finding a pattern of length mm in a compacted (linear-size) trie storing strings
over an alphabet of size σ σ. In static tries, we achieve O (m+\lg\lg σ) O (m+ lg lg σ) …

Breaking the 𝒪(n)-Barrier in the Construction of Compressed Suffix Arrays and Suffix Trees

D Kempa, T Kociumaka - Proceedings of the 2023 Annual ACM-SIAM …, 2023 - SIAM
The suffix array, describing the lexicographical order of suffixes of a given text, and the suffix
tree, a path-compressed trie of all suffixes, are the two most fundamental data structures for …

Bidirectional string anchors: A new string sampling mechanism

G Loukides, S Pissis - … 2021-29th Annual European Symposium on …, 2021 - inria.hal.science
The minimizers sampling mechanism is a popular mechanism for string sampling introduced
independently by Schleimer et al.[SIGMOD 2003] and by Roberts et al.[Bioinf. 2004]. Given …

Bidirectional String Anchors for Improved Text Indexing and Top- Similarity Search

G Loukides, SP Pissis… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The minimizers sampling mechanism is a popular mechanism for string sampling. However,
minimizers sampling mechanisms lack good guarantees on the expected size of their …

Wee lcp

J Fischer - Information Processing Letters, 2010 - Elsevier
We prove that longest common prefix (LCP) information can be stored in much less space
than previously known. More precisely, we show that in the presence of the text and the …

Deterministic indexing for packed strings

P Bille, IL Gørtz, FR Skjoldjensen - arXiv preprint arXiv:1612.01748, 2016 - arxiv.org
Given a string $ S $ of length $ n $, the classic string indexing problem is to preprocess $ S $
into a compact data structure that supports efficient subsequent pattern queries. In the\emph …

Managing unbounded-length keys in comparison-driven data structures with applications to online indexing

A Amir, G Franceschini, R Grossi, T Kopelowitz… - SIAM Journal on …, 2014 - SIAM
This paper presents a general technique for optimally transforming any dynamic data
structure that operates on atomic and indivisible keys by constant-time comparisons, into a …