External memory BWT and LCP computation for sequence collections with applications

L Egidi, FA Louza, G Manzini, GP Telles - Algorithms for Molecular Biology, 2019 - Springer
Background: Sequencing technologies produce larger and larger collections of
biosequences that have to be stored in compressed indices supporting fast search …

FSG: fast string graph construction for de novo assembly

P Bonizzoni, GD Vedova, Y Pirola… - Journal of …, 2017 - liebertpub.com
The string graph for a collection of next-generation reads is a lossless data representation
that is fundamental for de novo assemblers based on the overlap-layout-consensus …

A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

T Beller, E Ohlebusch - Algorithms for Molecular Biology, 2016 - Springer
Background: Recently, Marcus et al. (Bioinformatics 30: 3476–83, 2014) proposed to
use a compressed de Bruijn graph to describe the relationship between the genomes of …

LSG: an external-memory tool to compute string graphs for next-generation sequencing data assembly

P Bonizzoni, GD Vedova, Y Pirola… - Journal of …, 2016 - liebertpub.com
The large amount of short read data that has to be assembled in future applications, such as
in metagenomics or cancer genomics, strongly motivates the investigation of disk-based …

An external-memory algorithm for string graph construction

P Bonizzoni, G Della Vedova, Y Pirola, M Previtali… - Algorithmica, 2017 - Springer
Some recent results (Bauer et al. in Algorithms in bioinformatics, Springer, Berlin, pp 326–
337, 2012; Cox et al. in Algorithms in bioinformatics, Springer, Berlin, pp. 214–224, 2012; …

Simulating the DNA overlap graph in succinct space

D Díaz-Domínguez, T Gagie… - 30th Annual Symposium …, 2019 - drops.dagstuhl.de
Converting a set of sequencing reads into a lossless compact data structure that encodes all
the relevant biological information is a major challenge. The classical approaches are to …

Simulating the DNA string graph in succinct space

D Díaz-Domínguez, T Gagie, G Navarro - arXiv preprint arXiv:1901.10453, 2019 - arxiv.org
Converting a set of sequencing reads into a lossless compact data structure that encodes all
the relevant biological information is a major challenge. The classical approaches are to …

A New Lightweight Algorithm to compute the BWT and the LCP array of a Set of Strings

P Bonizzoni, G Della Vedova, S Nicosia… - arXiv preprint arXiv …, 2016 - arxiv.org
Indexing of very large collections of strings such as those produced by the widespread
sequencing technologies heavily relies on multi-string generalizations of the Burrows …

FSG: fast string graph construction for de novo assembly of reads data

P Bonizzoni, G Della Vedova, Y Pirola… - … and Applications: 12th …, 2016 - Springer
The string graph for a collection of next-generation reads is a lossless data representation
that is fundamental for de novo assemblers based on the overlap-layout-consensus …

Self-indexing for de novo assembly

M Previtali - 2017 - boa.unimib.it
For at least three decades computer science approaches have proved to be of utmost
importance for extrapolating knowledge from biological data. Indeed, the amount of data …