A guide to machine learning for biologists

JG Greener, SM Kandathil, L Moffat… - Nature reviews Molecular …, 2022 - nature.com
The expanding scale and inherent complexity of biological data have encouraged a growing
use of machine learning in biology to build informative and predictive models of the …

A primer on deep learning in genomics

J Zou, M Huss, A Abid, P Mohammadi, A Torkamani… - Nature …, 2019 - nature.com
Deep learning methods are a class of machine learning techniques capable of identifying
highly complex patterns in large datasets. Here, we provide a perspective and primer on …

Gene2vec: gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA

Q Zou, P Xing, L Wei, B Liu - RNA, 2019 - rnajournal.cshlp.org
N6-Methyladenosine (m6A) refers to methylation modification of the adenosine nucleotide
acid at the nitrogen-6 position. Many conventional computational methods for identifying N6 …

Towards a comprehensive catalogue of validated and target-linked human enhancers

M Gasperini, JM Tome, J Shendure - Nature Reviews Genetics, 2020 - nature.com
The human gene catalogue is essentially complete, but we lack an equivalently vetted
inventory of bona fide human enhancers. Hundreds of thousands of candidate enhancers …

Computational methods for analysing multiscale 3D genome organization

Y Zhang, L Boninsegna, M Yang, T Misteli… - Nature Reviews …, 2024 - nature.com
Recent progress in whole-genome mapping and imaging technologies has enabled the
characterization of the spatial organization and folding of the genome in the nucleus. In …

Applications of transformer-based language models in bioinformatics: a survey

S Zhang, R Fan, Y Liu, S Chen, Q Liu… - Bioinformatics …, 2023 - academic.oup.com
The transformer-based language models, including vanilla transformer, BERT and GPT-3,
have achieved revolutionary breakthroughs in the field of natural language processing …

A review of deep learning applications in human genomics using next-generation sequencing data

WS Alharbi, M Rashid - Human Genomics, 2022 - Springer
Genomics is advancing towards data-driven science. Through the advent of high-throughput
data generating technologies in human genomics, we are overwhelmed with the heap of …

Reaching the end-game for GWAS: machine learning approaches for the prioritization of complex disease loci

HL Nicholls, CR John, DS Watson, PB Munroe… - Frontiers in …, 2020 - frontiersin.org
Genome-wide association studies (GWAS) have revealed thousands of genetic loci that
underpin the complex biology of many human traits. However, the strength of GWAS–the …

Deciphering microbial gene function using natural language processing

D Miller, A Stern, D Burstein - Nature Communications, 2022 - nature.com
Revealing the function of uncharacterized genes is a fundamental challenge in an era of
ever-increasing volumes of sequencing data. Here, we present a concept for tackling this …

Interpretation of deep learning in genomics and epigenomics

A Talukder, C Barham, X Li, H Hu - Briefings in Bioinformatics, 2021 - academic.oup.com
Machine learning methods have been widely applied to big data analysis in
genomics and epigenomics research. Although accuracy and efficiency are common goals …