Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale. However, the energetics driving folding …
Patients with the heritable cancer disease, Lynch syndrome, carry germline variants in the MLH1, MSH2, MSH6 and PMS2 genes, encoding the central components of the DNA …
J Meier, R Rao, R Verkuil, J Liu… - Advances in neural …, 2021 - proceedings.neurips.cc
Modeling the effect of sequence variation on function is a fundamental problem for understanding and designing proteins. Since evolution encodes information about function …
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins to address our most pressing …
Current protein language models (PLMs) learn protein representations mainly based on their sequences, thereby well capturing co-evolutionary information, but they are unable to …
Machine learning–based design has gained traction in the sciences, most notably in the design of small molecules, materials, and proteins, with societal applications ranging from …
Z Zhou, L Zhang, Y Yu, B Wu, M Li, L Hong… - Nature …, 2024 - nature.com
Accurately modeling protein fitness landscapes is of great importance for protein engineering. Pre-trained protein language models have achieved state-of-the-art …
Protein design holds immense potential for optimizing naturally occurring proteins, with broad applications in drug discovery, material design, and sustainability. However …
T Truong Jr, T Bepler - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Generative protein language models are a natural way to design new proteins with desired functions. However, current models are either difficult to direct to produce a protein from a …