Benchmarking of alignment-free sequence comparison methods

A Zielezinski, HZ Girgis, G Bernard, CA Leimeister… - Genome biology, 2019 - Springer
Background: Alignment-free (AF) sequence comparison is attracting persistent interest driven
by data-intensive applications. Hence, many AF procedures have been proposed in recent …

Evolution of biosequence search algorithms: a brief survey

G Kucherov - Bioinformatics, 2019 - academic.oup.com
Motivation: Although modern high-throughput biomolecular technologies produce various
types of data, biosequence data remain at the core of bioinformatic analyses. However …

Fast alignment-free sequence comparison using spaced-word frequencies

CA Leimeister, M Boden, S Horwege, S Lindner… - …, 2014 - academic.oup.com
Motivation: Alignment-free methods for sequence comparison are increasingly used for
genome analysis and phylogeny reconstruction; they circumvent various difficulties of …

Skmer: assembly-free and alignment-free sample identification using genome skims

S Sarmashghi, K Bohmann, MTP Gilbert, V Bafna… - Genome biology, 2019 - Springer
The ability to inexpensively describe taxonomic diversity is critical in this era of rapid climate
and biodiversity changes. The recent genome-skimming approach extends current …

Spaced seeds improve k-mer-based metagenomic classification

K Břinda, M Sykulski, G Kucherov - Bioinformatics, 2015 - academic.oup.com
Motivation: Metagenomics is a powerful approach to study genetic content of environmental
samples, which has been strongly promoted by next-generation sequencing technologies …

Information theory in computational biology: where we stand today

P Chanda, E Costa, J Hu, S Sukumar, J Van Hemert… - Entropy, 2020 - mdpi.com
“A Mathematical Theory of Communication” was published in 1948 by Claude Shannon to
address the problems in the field of data compression and communication over (noisy) …

Fast and accurate phylogeny reconstruction using filtered spaced-word matches

CA Leimeister, S Sohrabi-Jahromi… - …, 2017 - academic.oup.com
Motivation: Word-based or 'alignment-free' algorithms are increasingly used for phylogeny
reconstruction and genome comparison, since they are much faster than traditional …

The Statistics of k-mers from a Sequence Undergoing a Simple Mutation Process Without Spurious Matches

A Blanca, RS Harris, D Koslicki… - Journal of Computational …, 2022 - liebertpub.com
k-mer-based methods are widely used in bioinformatics, but there are many gaps in our
understanding of their statistical properties. Here, we consider the simple model where a …

kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity

KD Murray, C Webers, CS Ong, J Borevitz… - PLoS computational …, 2017 - journals.plos.org
Modern genomics techniques generate overwhelming quantities of data. Extracting
population genetic variation demands computationally efficient methods to determine …

The number of k-mer matches between two DNA sequences as a function of k and applications to estimate phylogenetic distances

S Röhling, A Linne, J Schellhorn, M Hosseini… - PLoS ONE, 2020 - journals.plos.org
We study the number N_k of length-k word matches between pairs of evolutionarily related
DNA sequences, as a function of k. We show that the Jukes-Cantor distance between two …