Fast and accurate phylogeny reconstruction using filtered spaced-word matches

CA Leimeister, S Sohrabi-Jahromi… - …, 2017 - academic.oup.com
Motivation: Word-based or 'alignment-free' algorithms are increasingly used for phylogeny
reconstruction and genome comparison, since they are much faster than traditional …

The number of k-mer matches between two DNA sequences as a function of k and applications to estimate phylogenetic distances

S Röhling, A Linne, J Schellhorn, M Hosseini… - PLoS ONE, 2020 - journals.plos.org
We study the number N_k of length-k word matches between pairs of evolutionarily related
DNA sequences, as a function of k. We show that the Jukes-Cantor distance between two …

Minimally overlapping words for sequence similarity search

MC Frith, L Noé, G Kucherov - Bioinformatics, 2020 - academic.oup.com
Motivation: Analysis of genetic sequences is usually based on finding similar parts of
sequences, e.g. DNA reads and/or genomes. For big data, this is typically done via 'seeds' …

Prot-SpaM: fast alignment-free phylogeny reconstruction based on whole-proteome sequences

CA Leimeister, J Schellhorn, S Dörrer, M Gerth… - …, 2019 - academic.oup.com
Word-based or 'alignment-free' sequence comparison has become an active research area
in bioinformatics. While previous word-frequency approaches calculated rough measures of …

Read-SpaM: assembly-free and alignment-free comparison of bacterial genomes with low sequencing coverage

AK Lau, S Dörrer, CA Leimeister, C Bleidorn… - BMC …, 2019 - Springer
Background: In many fields of biomedical research, it is important to estimate phylogenetic
distances between taxa based on low-coverage sequencing reads. Major applications are …

Mismatch-tolerant, alignment-free sequence classification using multiple spaced seeds and multiindex Bloom filters

J Chu, H Mohamadi, E Erhan, J Tse… - Proceedings of the …, 2020 - National Acad Sciences
Alignment-free classification tools have enabled high-throughput processing of sequencing
data in many bioinformatics analysis pipelines primarily due to their computational …

Fast gapped k-mer counting with subdivided multi-way bucketed cuckoo hash tables

J Zentgraf, S Rahmann - 22nd International Workshop on …, 2022 - drops.dagstuhl.de
Motivation. In biological sequence analysis, alignment-free (also known as k-mer-based)
methods are increasingly replacing mapping- and alignment-based methods for various …

'Multi-SpaM': a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees

T Dencker, CA Leimeister, M Gerth… - NAR Genomics and …, 2020 - academic.oup.com
Word-based or 'alignment-free' methods for phylogeny inference have become popular in
recent years. These methods are much faster than traditional, alignment-based approaches …

Efficient computation of spaced seed hashing with block indexing

S Girotto, M Comin, C Pizzi - BMC bioinformatics, 2018 - Springer
Background: Spaced-seeds, i.e. patterns in which some fixed positions are allowed to be
wildcards, play a crucial role in several bioinformatics applications involving substrings counting …

S-conLSH: alignment-free gapped mapping of noisy long reads

A Chakraborty, B Morgenstern, S Bandyopadhyay - BMC bioinformatics, 2021 - Springer
Background: The advancement of SMRT technology has unfolded new opportunities of
genome analysis with its longer read length and low GC bias. Alignment of the reads to their …