Graph representation learning in biomedicine and healthcare

MM Li, K Huang, M Zitnik - Nature Biomedical Engineering, 2022 - nature.com
Networks—or graphs—are universal descriptors of systems of interacting elements. In
biomedicine and healthcare, they can represent, for example, molecular interactions …

The language of proteins: NLP, machine learning & protein sequences

D Ofer, N Brandes, M Linial - Computational and Structural Biotechnology …, 2021 - Elsevier
Natural language processing (NLP) is a field of computer science concerned with automated
text and language analysis. In recent years, following a series of breakthroughs in deep and …

Evolutionary-scale prediction of atomic-level protein structure with a language model

Z Lin, H Akin, R Rao, B Hie, Z Zhu, W Lu, N Smetanin… - Science, 2023 - science.org
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …

Large language models generate functional protein sequences across diverse families

A Madani, B Krause, ER Greene, S Subramanian… - Nature …, 2023 - nature.com
Deep-learning language models have shown promise in various biotechnological
applications, including protein design and engineering. Here we describe ProGen, a …

Language models of protein sequences at the scale of evolution enable accurate structure prediction

Z Lin, H Akin, R Rao, B Hie, Z Zhu, W Lu… - bioRxiv, 2022 - biorxiv.org
Large language models have recently been shown to develop emergent capabilities with
scale, going beyond simple pattern matching to perform higher level reasoning and …

DeepLoc 2.0: multi-label subcellular localization prediction using protein language models

V Thumuluri, JJ Almagro Armenteros… - Nucleic acids …, 2022 - academic.oup.com
The prediction of protein subcellular localization is of great relevance for proteomics
research. Here, we propose an update to the popular tool DeepLoc with multi-localization …

High-resolution de novo structure prediction from primary sequence

R Wu, F Ding, R Wang, R Shen, X Zhang, S Luo, C Su… - bioRxiv, 2022 - biorxiv.org
Recent breakthroughs have used deep learning to exploit evolutionary information in
multiple sequence alignments (MSAs) to accurately predict protein structures. However …

PromptAid: Prompt exploration, perturbation, testing and iteration using visual analytics for large language models

A Mishra, U Soni, A Arunkumar, J Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have gained widespread popularity due to their ability to
perform ad-hoc Natural Language Processing (NLP) tasks with a simple natural language …

Language models enable zero-shot prediction of the effects of mutations on protein function

J Meier, R Rao, R Verkuil, J Liu… - Advances in neural …, 2021 - proceedings.neurips.cc
Modeling the effect of sequence variation on function is a fundamental problem for
understanding and designing proteins. Since evolution encodes information about function …

Rethinking attention with Performers

K Choromanski, V Likhosherstov, D Dohan… - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce Performers, Transformer architectures which can estimate regular (softmax)
full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to …