Cysteine oxidation in proteins: structure, biophysics, and simulation

D Garrido Ruiz, A Sandoval-Perez, AV Rangarajan… - Biochemistry, 2022 - ACS Publications
Cysteine side chains can exist in distinct oxidation states depending on the pH and redox
potential of the environment, and cysteine oxidation plays important yet complex regulatory …

Evolutionary-scale prediction of atomic-level protein structure with a language model

Z Lin, H Akin, R Rao, B Hie, Z Zhu, W Lu, N Smetanin… - Science, 2023 - science.org
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …

Learning inverse folding from millions of predicted structures

C Hsu, R Verkuil, J Liu, Z Lin, B Hie… - International …, 2022 - proceedings.mlr.press
We consider the problem of predicting a protein sequence from its backbone atom
coordinates. Machine learning approaches to this problem to date have been limited by the …

Protein structure generation via folding diffusion

KE Wu, KK Yang, R van den Berg, S Alamdari… - Nature …, 2024 - nature.com
The ability to computationally generate novel yet physically foldable protein structures could
lead to new biological discoveries and new treatments targeting yet incurable diseases …

Fast and flexible protein design using deep graph neural networks

A Strokach, D Becerra, C Corbi-Verge, A Perez-Riba… - Cell Systems, 2020 - cell.com
Protein structure and function is determined by the arrangement of the linear sequence of
amino acids in 3D space. We show that a deep graph neural network, ProteinSolver, can …

Score-based generative modeling for de novo protein design

JS Lee, J Kim, PM Kim - Nature Computational Science, 2023 - nature.com
The generation of de novo protein structures with predefined functions and properties
remains a challenging problem in protein design. Diffusion models, also known as score …

Julia for biologists

E Roesch, JG Greener, AL MacLean, H Nassar… - Nature …, 2023 - nature.com
Major computational challenges exist in relation to the collection, curation, processing and
analysis of large genomic and imaging datasets, as well as the simulation of larger and …

Language models generalize beyond natural proteins

R Verkuil, O Kabeli, Y Du, BIM Wicky, LF Milles… - bioRxiv, 2022 - biorxiv.org
Learning the design patterns of proteins from sequences across evolution may have
promise toward generative protein design. However it is unknown whether language …

Bilingual language model for protein sequence and structure

M Heinzinger, K Weissenow, JG Sanchez, A Henkel… - bioRxiv, 2023 - biorxiv.org
Advanced Artificial Intelligence (AI) enabled large language models (LLMs) to revolutionize
Natural Language Processing (NLP). Their adaptation to protein sequences spawned the …

Simulating 500 million years of evolution with a language model

T Hayes, R Rao, H Akin, NJ Sofroniew, D Oktay, Z Lin… - bioRxiv, 2024 - biorxiv.org
More than three billion years of evolution have produced an image of biology encoded into
the space of natural proteins. Here we show that language models trained on tokens …