Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment …
SPLASH is an unsupervised, reference-free, and unifying algorithm that discovers regulated sequence variation through statistical analysis of k-mer composition, subsuming many …
We present a unifying statistical formulation for many fundamental problems in genome science and develop a reference-free, highly efficient algorithm that solves it. Sequence …
K Chaung, TZ Baharav, G Henderson, P Wang… - 2022 - europepmc.org
We show that myriad, disparate mechanisms that diversify genomes and transcriptomes can be captured by a unifying principle: sample-dependent sequence variation. This variation …
E Meyer, E Saldivar, M Kokot, B Xue, S Deorowicz… - bioRxiv, 2024 - biorxiv.org
Most plant genomes and their regulation remain unknown. We used SPLASH, a new reference-genome-free sequence variation detection algorithm, to analyze transcriptional …
Recent years have seen a sustained exponential growth in the volume of data generated within the domains of data science and biology. This dramatic pace of data generation has …
TZ Baharav, D Tse, J Salzman - bioRxiv - ncbi.nlm.nih.gov
Contingency tables, data represented as counts matrices, are ubiquitous across quantitative research and data-science applications. Existing statistical tests are insufficient, however, as …