Full-text indexes provide fast substring search over large text collections. A serious problem of these indexes has traditionally been their space consumption. A recent trend is to develop …
H Li, R Durbin - Bioinformatics, 2009 - academic.oup.com
Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first …
Information retrieval (IR) has changed considerably in recent years with the expansion of the World Wide Web and the advent of modern and inexpensive graphical user interfaces and …
Indexing highly repetitive texts—such as genomic databases, software repositories and versioned text collections—has become an important problem since the turn of the …
P Ferragina, G Manzini - Proceedings 41st annual symposium …, 2000 - ieeexplore.ieee.org
We address the issue of compressing and indexing data. We devise a data structure whose space occupancy is a function of the entropy of the underlying data set. We call the data …
Parallel technology has boosted data processing in recent years, and parallel direct data processing on hierarchically compressed documents exhibits great promise. The high …
We consider the indexable dictionary problem, which consists of storing a set S ⊆ {0, …, m−1} for some integer m while supporting the operations of rank(x), which returns the number …
We design two compressed data structures for the full-text indexing problem that support efficient substring searches using roughly the space required for storing the text in …
MI Abouelhoda, S Kurtz, E Ohlebusch - Journal of Discrete Algorithms, 2004 - Elsevier
The suffix tree is one of the most important data structures in string processing and comparative genomics. However, the space consumption of the suffix tree is a bottleneck in …