Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities

M Zitnik, F Nguyen, B Wang, J Leskovec… - Information …, 2019 - Elsevier
New technologies have enabled the investigation of biology and human health at an
unprecedented scale and in multiple dimensions. These dimensions include a myriad of …

Managing the steady state chromatin landscape by nucleosome dynamics

K Ahmad, S Henikoff… - Annual review of …, 2022 - annualreviews.org
Gene regulation arises out of dynamic competition between nucleosomes, transcription
factors, and other chromatin proteins for the opportunity to bind genomic DNA. The …

Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data

HA Pliner, JS Packer, JL McFaline-Figueroa… - Molecular cell, 2018 - cell.com
Linking regulatory DNA elements to their target genes, which may be located hundreds of
kilobases away, remains challenging. Here, we introduce Cicero, an algorithm that identifies …

Avocado: a multi-scale deep tensor factorization method learns a latent representation of the human epigenome

J Schreiber, T Durham, J Bilmes, WS Noble - Genome biology, 2020 - Springer
The human epigenome has been experimentally characterized by thousands of
measurements for every basepair in the human genome. We propose a deep neural …

FUN-LDA: a latent Dirichlet allocation model for predicting tissue-specific functional effects of noncoding variation: methods and applications

D Backenroth, Z He, K Kiryluk, V Boeva… - The American Journal of …, 2018 - cell.com
We describe a method based on a latent Dirichlet allocation model for predicting functional
effects of noncoding genetic variants in a cell-type- and/or tissue-specific way (FUN-LDA) …

A flexible repertoire of transcription factor binding sites and a diversity threshold determines enhancer activity in embryonic stem cells

G Singh, S Mullany, SD Moorthy, R Zhang… - Genome …, 2021 - genome.cshlp.org
Transcriptional enhancers are critical for development and phenotype evolution and are
often mutated in disease contexts; however, even in well-studied cell types, the sequence …

Robust chromatin state annotation

MF Shahraki, M Farahbod, MW Libbrecht - Genome Research, 2024 - genome.cshlp.org
With the goal of mapping genomic activity, international projects have recently measured
epigenetic activity in hundreds of cell and tissue types. Chromatin state annotations …

Deregulated regulators: disease-causing cis variants in transcription factor genes

R van der Lee, S Correard, WW Wasserman - Trends in Genetics, 2020 - cell.com
Whole-genome sequencing is accelerating identification of noncoding variants that disrupt
gene expression, although reports of such regulatory variants implicated in disease remain …

Machine learning methods to model multicellular complexity and tissue specificity

RSG Sealfon, AK Wong, OG Troyanskaya - Nature Reviews Materials, 2021 - nature.com
Experimental approaches to study tissue specificity enable insight into the nature and
organization of the cell types and tissues that constitute complex multicellular organisms …

GREEN-DB: a framework for the annotation and prioritization of non-coding regulatory variants from whole-genome sequencing data

E Giacopuzzi, N Popitsch, JC Taylor - Nucleic Acids Research, 2022 - academic.oup.com
Non-coding variants have long been recognized as important contributors to common
disease risks, but with the expansion of clinical whole genome sequencing, examples of …