OASIS: An interpretable, finite-sample valid alternative to Pearson's X² for scientific discovery

TZ Baharav, D Tse, J Salzman - Proceedings of the …, 2024 - National Acad Sciences
Contingency tables, data represented as counts matrices, are ubiquitous across quantitative
research and data-science applications. Existing statistical tests are insufficient however, as …

SPLASH: a statistical, reference-free genomic algorithm unifies biological discovery

K Chaung, TZ Baharav, G Henderson, IN Zheludev… - Cell, 2023 - cell.com
Today's genomics workflows typically require alignment to a reference sequence, which
limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment …

SPLASH2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads

M Kokot, R Dehghannasiri, T Baharav, J Salzman… - bioRxiv, 2023 - ncbi.nlm.nih.gov
SPLASH is an unsupervised, reference-free, and unifying algorithm that discovers regulated
sequence variation through statistical analysis of k-mer composition, subsuming many …

A statistical, reference-free algorithm subsumes myriad problems in genome science and enables novel discovery

K Chaung, T Baharav, I Zheludev, J Salzman - bioRxiv, 2022 - biorxiv.org
We present a unifying statistical formulation for many fundamental problems in genome
science and develop a reference-free, highly efficient algorithm that solves it. Sequence …

A statistical reference-free algorithm subsumes and generalizes common genomic sequence analysis and uncovers novel biological regulation

K Chaung, TZ Baharav, G Henderson, P Wang… - 2022 - europepmc.org
We show that myriad, disparate mechanisms that diversify genomes and transcriptomes can
be captured by a unifying principle: sample-dependent sequence variation. This variation …

A reference-free algorithm discovers regulation in the plant transcriptome

E Meyer, E Saldivar, M Kokot, B Xue, S Deorowicz… - bioRxiv, 2024 - biorxiv.org
Most plant genomes and their regulation remain unknown. We used SPLASH, a new,
reference-genome free sequence variation detection algorithm, to analyze transcriptional …

Adaptive Algorithms for Data Science and Computational Genomics

TZ Baharav - 2023 - search.proquest.com
Recent years have seen a sustained exponential growth in the volume of data generated
within the domains of data science and biology. This dramatic pace of data generation has …

OASIS: An interpretable, finite-sample valid alternative to Pearson's X² for scientific discovery

TZ Baharav, D Tse, J Salzman - bioRxiv - ncbi.nlm.nih.gov
Contingency tables, data represented as counts matrices, are ubiquitous across quantitative
research and data-science applications. Existing statistical tests are insufficient however, as …