Machine learning-guided protein engineering

P Kouba, P Kohout, F Haddadi, A Bushuiev… - ACS …, 2023 - ACS Publications
Recent progress in engineering highly promising biocatalysts has increasingly involved
machine learning methods. These methods leverage existing experimental and simulation …

Machine learning for functional protein design

P Notin, N Rollins, Y Gal, C Sander, D Marks - Nature Biotechnology, 2024 - nature.com
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and
structure data have radically transformed computational protein design. New methods …

Evolutionary-scale prediction of atomic-level protein structure with a language model

Z Lin, H Akin, R Rao, B Hie, Z Zhu, W Lu, N Smetanin… - Science, 2023 - science.org
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …

ProtGPT2 is a deep unsupervised language model for protein design

N Ferruz, S Schmidt, B Höcker - Nature Communications, 2022 - nature.com
Protein design aims to build novel proteins customized for specific purposes, thereby
holding the potential to tackle many environmental and biomedical problems. Recent …

Genome-wide prediction of disease variant effects with a deep protein language model

N Brandes, G Goldman, CH Wang, CJ Ye, V Ntranos - Nature Genetics, 2023 - nature.com
Predicting the effects of coding variants is a major challenge. While recent deep-learning
models have improved variant effect prediction accuracy, they cannot analyze all coding …

Efficient evolution of human antibodies from general protein language models

BL Hie, VR Shanker, D Xu, TUJ Bruun… - Nature …, 2024 - nature.com
Natural evolution must explore a vast landscape of possible sequences for desirable yet
rare mutations, suggesting that learning from natural evolutionary strategies could guide …

ProteinGym: Large-scale benchmarks for protein fitness prediction and design

P Notin, A Kollasch, D Ritter… - Advances in …, 2024 - proceedings.neurips.cc
Predicting the effects of mutations in proteins is critical to many applications, from
understanding genetic disease to designing novel proteins to address our most pressing …

ProtST: Multi-modality learning of protein sequences and biomedical texts

M Xu, X Yuan, S Miret, J Tang - International Conference on …, 2023 - proceedings.mlr.press
Current protein language models (PLMs) learn protein representations mainly based on
their sequences, thereby well capturing co-evolutionary information, but they are unable to …

Graph denoising diffusion for inverse protein folding

K Yi, B Zhou, Y Shen, P Liò… - Advances in Neural …, 2024 - proceedings.neurips.cc
Inverse protein folding is challenging due to its inherent one-to-many mapping
characteristic, where numerous possible amino acid sequences can fold into a single …

xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein

B Chen, X Cheng, P Li, Y Geng, J Gong, S Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Protein language models have shown remarkable success in learning biological information
from protein sequences. However, most existing models are limited by either autoencoding …