When less is more: sketching with minimizers in genomics

M Ndiaye, S Prieto-Baños, LM Fitzgerald… - Genome Biology, 2024 - Springer
The exponential increase in sequencing data calls for conceptual and computational
advances to extract useful biological insights. One such advance, minimizers, allows for …

Creating and using minimizer sketches in computational genomics

H Zheng, G Marçais, C Kingsford - Journal of Computational …, 2023 - liebertpub.com
Processing large data sets has become an essential part of computational genomics.
Greatly increased availability of sequence data from multiple sources has fueled …

Complete sequencing of ape genomes

DA Yoo, A Rhie, P Hebbar, F Antonacci… - BioRxiv, 2024 - pmc.ncbi.nlm.nih.gov
We present haplotype-resolved reference genomes and comparative analyses of six ape
species, namely: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan …

The mod-minimizer: a simple and efficient sampling algorithm for long k-mers

R Groot Koerkamp, GE Pibiri - bioRxiv, 2024 - biorxiv.org
Motivation: Given a string S, a minimizer scheme is an algorithm defined by a triple (k, w, 𝒪)
that samples a subset of k-mers (k-long substrings) from a string S. Specifically, it samples …

A near-tight lower bound on the density of forward sampling schemes

B Kille, R Groot Koerkamp, D McAdams, A Liu… - …, 2025 - academic.oup.com
Motivation: Sampling k-mers is a ubiquitous task in sequence analysis algorithms. Sampling
schemes such as the often-used random minimizer scheme are particularly appealing as …

ModDotPlot—rapid and interactive visualization of tandem repeats

AP Sweeten, MC Schatz, AM Phillippy - Bioinformatics, 2024 - academic.oup.com
Motivation: A common method for analyzing genomic repeats is to produce a sequence
similarity matrix visualized via a dot plot. Innovative approaches such as StainedGlass have …

PneumoBrowse 2: an integrated visual platform for curated genome annotation and multiomics data analysis of Streptococcus pneumoniae

AB Janssen, PS Gibson, AM Bravo… - Nucleic Acids …, 2025 - academic.oup.com
Streptococcus pneumoniae is an opportunistic human pathogen responsible for high
morbidity and mortality rates. Extensive genome sequencing revealed its large pangenome …

Polyploidization-driven transcriptomic dynamics in Medicago sativa neotetraploids: mRNA, smRNA and allele-specific gene expression

DF Santoro, G Marconi, S Capomaccio, M Bocchini… - BMC Plant …, 2025 - Springer
Whole genome duplication (WGD) is a powerful evolutionary mechanism in plants.
Autopolyploids have been comparatively less studied than allopolyploids, with sexual …

Genome assembly in the telomere-to-telomere era

H Li, R Durbin - Nature Reviews Genetics, 2024 - nature.com
Genome sequences largely determine the biology and encode the history of an organism,
and de novo assembly—the process of reconstructing the genome sequence of an organism …

The open-closed mod-minimizer algorithm

R Groot Koerkamp, D Liu, GE Pibiri - bioRxiv, 2024 - biorxiv.org
Sampling algorithms that deterministically select a subset of k-mers are an important
building block in bioinformatics applications. For example, they are used to index large …