Compressed full-text indexes

G Navarro, V Mäkinen - ACM Computing Surveys (CSUR), 2007 - dl.acm.org
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …

SeqAn an efficient, generic C++ library for sequence analysis

A Döring, D Weese, T Rausch, K Reinert - BMC bioinformatics, 2008 - Springer
Background: The use of novel algorithmic techniques is pivotal to many important problems
in life science. For example the sequencing of the human genome [1] would not have been …

[BOOK][B] Handbook of computational molecular biology

S Aluru - 2005 - taylorfrancis.com
The enormous complexity of biological systems at the molecular level must be answered
with powerful computational methods. Computational biology is a young field, but has seen …

Indexing methods for approximate dictionary searching: Comparative analysis

L Boytsov - Journal of Experimental Algorithmics (JEA), 2011 - dl.acm.org
The primary goal of this article is to survey state-of-the-art indexing methods for approximate
dictionary searching. To improve understanding of the field, we introduce a taxonomy that …

Compressed text indexes: From theory to practice

P Ferragina, R González, G Navarro… - Journal of Experimental …, 2009 - dl.acm.org
A compressed full-text self-index represents a text in a compressed form and still answers
queries efficiently. This represents a significant advancement over the (full-) text indexing …

A microbial detection array (MDA) for viral and bacterial detection

SN Gardner, CJ Jaing, KS McLoughlin, TR Slezak - BMC genomics, 2010 - Springer
Background: Identifying the bacteria and viruses present in a complex sample is useful in
disease diagnostics, product safety, environmental characterization, and research. Array …

Fast and accurate read mapping with approximate seeds and multiple backtracking

E Siragusa, D Weese, K Reinert - Nucleic acids research, 2013 - academic.oup.com
We present Masai, a read mapper representing the state-of-the-art in terms of speed and
accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, 2–4 times …

Prospects and limitations of full-text index structures in genome analysis

M Vyverman, B De Baets, V Fack… - Nucleic acids …, 2012 - academic.oup.com
The combination of incessant advances in sequencing technology producing large amounts
of data and innovative bioinformatics approaches, designed to cope with this data flood, has …

Practical methods for constructing suffix trees

Y Tian, S Tata, RA Hankins, JM Patel - The VLDB Journal, 2005 - Springer
Sequence datasets are ubiquitous in modern life-science applications, and querying
sequences is a common and critical operation in many of these applications. The suffix tree …

Genome-scale disk-based suffix tree indexing

B Phoophakdee, MJ Zaki - Proceedings of the 2007 ACM SIGMOD …, 2007 - dl.acm.org
With the exponential growth of biological sequence databases, it has become critical to
develop effective techniques for storing, querying, and analyzing these massive data. Suffix …