Alternate Representation of Distance Matrices for Characterization of Protein Structure
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
PRIB '09 Proceedings of the 4th IAPR International Conference on Pattern Recognition in Bioinformatics
A comparative study of multi-classification methods for protein fold recognition
International Journal of Computational Intelligence in Bioinformatics and Systems Biology
A similarity network approach for the analysis and comparison of protein sequence/structure sets
Journal of Biomedical Informatics
A Study of Hierarchical and Flat Classification of Proteins
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
A protein classifier based on SVM by using the voxel based descriptor
RSCTC'10 Proceedings of the 7th international conference on Rough sets and current trends in computing
FRAN and RBF-PSO as two components of a hyper framework to recognize protein folds
Computers in Biology and Medicine
The classification of proteins based on their structure can play an important role in the deduction or discovery of protein function. However, the relatively low number of solved protein structures and the unknown relationship between structure and sequence require an alternative method of representation for classification to be effective. Furthermore, the large number of potential folds causes problems for many classification strategies, increasing the likelihood that the classifier will reach a local optimum while trying to distinguish between all of the possible structural categories. Here we present a hierarchical strategy for structural classification that first partitions proteins based on their SCOP class before attempting to assign a protein fold. Using a well-known dataset derived from the 27 most-populated SCOP folds and several sequence-based descriptor properties as input features, we test a number of classification methods, including Naïve Bayes and Boosted C4.5. Our strategy achieves an average fold recognition of 74%, which is significantly higher than the 56-60% previously reported in the literature, indicating the effectiveness of a multi-level approach.
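The two-level idea described in the abstract (predict the SCOP class first, then choose a fold only among the folds of that class) can be illustrated with a minimal sketch. This is not the authors' implementation: a toy nearest-centroid classifier stands in for the Naïve Bayes and Boosted C4.5 learners the abstract mentions, and the names `fit_centroids`, `nearest_label`, and `HierarchicalFoldClassifier`, the two-dimensional feature vectors, and the labels are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from math import dist


def fit_centroids(X, y):
    """Return {label: centroid} for a toy nearest-centroid classifier."""
    groups = defaultdict(list)
    for features, label in zip(X, y):
        groups[label].append(features)
    # Average each feature column over the rows that share a label.
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def nearest_label(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


class HierarchicalFoldClassifier:
    """Level 1 predicts the structural class; level 2 predicts a fold within it."""

    def fit(self, X, classes, folds):
        # Level 1: one model that separates the structural classes.
        self.class_model = fit_centroids(X, classes)
        # Level 2: one fold model per class, trained only on that class's proteins.
        per_class = defaultdict(lambda: ([], []))
        for x, c, f in zip(X, classes, folds):
            per_class[c][0].append(x)
            per_class[c][1].append(f)
        self.fold_models = {
            c: fit_centroids(xs, fs) for c, (xs, fs) in per_class.items()
        }
        return self

    def predict(self, x):
        c = nearest_label(self.class_model, x)
        return c, nearest_label(self.fold_models[c], x)


# Hypothetical toy data: two classes, each containing two folds.
X = [(0.0, 0.0), (0.2, 0.0), (0.0, 2.0), (0.2, 2.0),
     (5.0, 0.0), (5.2, 0.0), (5.0, 2.0), (5.2, 2.0)]
classes = ["alpha", "alpha", "alpha", "alpha", "beta", "beta", "beta", "beta"]
folds = ["a1", "a1", "a2", "a2", "b1", "b1", "b2", "b2"]

model = HierarchicalFoldClassifier().fit(X, classes, folds)
print(model.predict((0.1, 1.8)))
print(model.predict((4.9, 0.3)))
```

Because each second-level model only has to separate the folds of a single class, it faces far fewer categories than a flat classifier trained on all folds at once. The trade-off is that a wrong class prediction at the first level cannot be corrected at the second.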