Accurate proteome-wide missense variant effect prediction with AlphaMissense

J Cheng, G Novati, J Pan, C Bycroft, A Žemgulytė… - Science, 2023 - science.org
The vast majority of missense variants observed in the human genome are of unknown
clinical significance. We present AlphaMissense, an adaptation of AlphaFold fine-tuned on …

Mega-scale experimental analysis of protein folding stability in biology and design

K Tsuboyama, J Dauparas, J Chen, E Laine… - Nature, 2023 - nature.com
Advances in DNA sequencing and machine learning are providing insights into protein
sequences and structures on an enormous scale. However, the energetics driving folding …

Lynch syndrome, molecular mechanisms and variant classification

AB Abildgaard, SV Nielsen, I Bernstein, A Stein… - British journal of …, 2023 - nature.com
Patients with the heritable cancer disease, Lynch syndrome, carry germline variants in the
MLH1, MSH2, MSH6 and PMS2 genes, encoding the central components of the DNA …

Language models enable zero-shot prediction of the effects of mutations on protein function

J Meier, R Rao, R Verkuil, J Liu… - Advances in neural …, 2021 - proceedings.neurips.cc
Modeling the effect of sequence variation on function is a fundamental problem for
understanding and designing proteins. Since evolution encodes information about function …

ProteinGym: Large-scale benchmarks for protein fitness prediction and design

P Notin, A Kollasch, D Ritter… - Advances in …, 2024 - proceedings.neurips.cc
Predicting the effects of mutations in proteins is critical to many applications, from
understanding genetic disease to designing novel proteins to address our most pressing …

ProtST: Multi-modality learning of protein sequences and biomedical texts

M Xu, X Yuan, S Miret, J Tang - International Conference on …, 2023 - proceedings.mlr.press
Current protein language models (PLMs) learn protein representations mainly based on
their sequences, thereby well capturing co-evolutionary information, but they are unable to …

Is novelty predictable?

C Fannjiang, J Listgarten - Cold Spring Harbor …, 2024 - cshperspectives.cshlp.org
Machine learning–based design has gained traction in the sciences, most notably in the
design of small molecules, materials, and proteins, with societal applications ranging from …

Enhancing efficiency of protein language models with minimal wet-lab data through few-shot learning

Z Zhou, L Zhang, Y Yu, B Wu, M Li, L Hong… - Nature …, 2024 - nature.com
Accurately modeling the protein fitness landscapes holds great importance for protein
engineering. Pre-trained protein language models have achieved state-of-the-art …

ProteinNPT: Improving protein property prediction and design with non-parametric transformers

P Notin, R Weitzman, D Marks… - Advances in Neural …, 2023 - proceedings.neurips.cc
Protein design holds immense potential for optimizing naturally occurring proteins, with
broad applications in drug discovery, material design, and sustainability. However …

PoET: A generative model of protein families as sequences-of-sequences

T Truong Jr, T Bepler - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Generative protein language models are a natural way to design new proteins with desired
functions. However, current models are either difficult to direct to produce a protein from a …