The frequencies of oppositely charged, uncharged polar, and β-branched amino acids determine proteins' thermostability

S Sun, C Ao, D Wang, B Dong - IEEE Access, 2020 - ieeexplore.ieee.org
Enhancing protein thermostability is an important aspect of enzyme engineering. Many studies have investigated the properties that determine protein thermostability, but no consensus has emerged. To understand the mechanisms underlying the high thermostability of thermophilic proteins, we evaluated the relative importance of amino acid frequencies in protein sequences for discriminating between thermophilic and non-thermophilic proteins, using machine learning algorithms together with a three-step feature selection procedure and a principal component (PC) analysis to remove noisy and redundant information. Our results showed that the frequencies of oppositely charged amino acids, i.e., Lys and Glu, were higher in thermophilic proteins, suggesting that electrostatic interactions are fundamentally important for protein stabilization at high temperatures. Further, we found that the frequencies of uncharged polar amino acids, which are thermolabile or actively interact with water molecules, were lower in thermophilic proteins. Moreover, the frequencies of β-branched aliphatic amino acids tended to increase with increasing thermostability. Overall, these results suggest that protein thermostability is determined by a few protein features, which were well captured by the first two PCs. A classifier based on only the first two PCs achieved a high accuracy of 90%, suggesting that our classifier could be an effective and efficient tool for engineering stable proteins.
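The input features described in the abstract are per-sequence amino acid frequencies. As a minimal sketch of how such features could be computed (not the authors' code), the Python below derives the 20 standard amino acid frequencies from a sequence and sums them over three illustrative groups. The function names and the group memberships (Lys/Glu, Ser/Thr/Asn/Gln, Val/Ile) are assumptions made for this example; the abstract does not list the exact residues in each class, and the feature selection, PC analysis, and classifier steps are not reproduced here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative groupings only (an assumption for this sketch, not the
# paper's feature definitions).
GROUPS = {
    "lys_glu": "KE",                  # oppositely charged pair named in the abstract
    "uncharged_polar": "STNQ",        # assumed membership
    "beta_branched_aliphatic": "VI",  # assumed membership
}


def aa_frequencies(sequence):
    """Return the relative frequency of each standard amino acid."""
    seq = sequence.upper()
    # Ignore non-standard symbols such as X, B, Z, or gap characters.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def group_frequencies(sequence):
    """Sum the per-residue frequencies over each illustrative group."""
    freqs = aa_frequencies(sequence)
    return {
        name: sum(freqs[aa] for aa in members)
        for name, members in GROUPS.items()
    }
```

Applied to a set of labeled sequences, `aa_frequencies` would yield one 20-dimensional vector per protein, which is the kind of input a principal component analysis and a downstream classifier could then operate on.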