Authors
Ankita Srivastava, Atif Mahmood, Ritesh Srivastava
Publication date
2017/7
Journal
3rd International Conference on New Frontiers of Engineering, Science, Management and Humanities
Volume
6
Pages
30-313
Description
Enzymes are proteins that catalyze biochemical reactions in different ways and play important roles in metabolic pathways. The exponential rise in the number of new enzyme sequences has made it necessary to develop methods that accurately predict their function. Various approaches have been applied to this problem, but they fail for dissimilar proteins that perform the same function. In this paper we present a machine learning approach that predicts the main function class of an enzyme from 41 sequence-derived features, all of which can be extracted using freely available online tools. Random Forest, which has proven to be a very efficient data mining algorithm, gave the best results, with an overall accuracy of 72.5% and precision and recall in the ranges of 65% to 91% and 60% to 86%, respectively. Our data sets were taken from the PDB.
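The abstract reports one overall accuracy together with ranges for precision and recall, which presumably span the individual enzyme classes. As a minimal sketch of how such figures are computed from a classifier's predictions, the following uses only the Python standard library. The function name `per_class_metrics`, the labels `EC1` to `EC3`, and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (overall accuracy, {label: (precision, recall)})."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples predicted as each class
    tp = Counter()                 # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # Precision: of everything predicted as `label`, how much was right.
        precision = tp[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: of everything truly `label`, how much was found.
        recall = tp[label] / true_count[label] if true_count[label] else 0.0
        metrics[label] = (precision, recall)
    return accuracy, metrics


# Hypothetical toy data: three enzyme main classes, six samples.
y_true = ["EC1", "EC1", "EC2", "EC2", "EC3", "EC3"]
y_pred = ["EC1", "EC2", "EC2", "EC2", "EC3", "EC1"]

accuracy, metrics = per_class_metrics(y_true, y_pred)
for label, (precision, recall) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
print(f"overall accuracy={accuracy:.2f}")
```

Reporting the minimum and maximum of the per-class values gives ranges of the kind quoted in the abstract, while the accuracy is a single figure over all samples.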