Motivation: Fold recognition is an important step in protein structure and function prediction. Traditional sequence comparison methods fail to identify reliable homologies when sequence identity is low. Taxonomic methods are effective alternatives, but their prediction accuracies are around 70%, which is still relatively low for practical use.
Results: In this study, a simple and powerful method is presented for taxonomic fold recognition, which combines a support vector machine (SVM) with the autocross-covariance (ACC) transformation. The evolutionary information, represented in the form of position-specific score matrices, is converted into a series of fixed-length vectors by the ACC transformation, and these vectors are then input to an SVM classifier for fold recognition. This scheme effectively captures the sequence-order effect. Experiments are performed on the widely used D-B dataset and on the corresponding extended dataset. The proposed method, called ACCFold, achieves an overall accuracy of 70.1% on the D-B dataset, which is higher than major existing taxonomic methods by 2–14%. Furthermore, the method achieves an overall accuracy of 87.6% on the extended dataset, which surpasses major existing taxonomic methods by 9–17%. Additionally, our method obtains an overall accuracy of 80.9% for 86 folds and 77.2% for 199 folds. These results demonstrate that the ACCFold method provides state-of-the-art performance for taxonomic fold recognition.
Availability: The source code for the ACC transformation is freely available at http://www.iipl.fudan.edu.cn/demo/accpkg.html.
Contact: sgzhou@fudan.edu.cn
Supplementary information: Supplementary data are available at Bioinformatics online.
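To illustrate how a position-specific score matrix of any length can become a fixed-length vector, the following is a minimal sketch of an auto-cross covariance transformation. The abstract does not state the formula, so the sketch assumes the commonly used definition: each column is centred on its mean over the sequence, and the products of values `lag` positions apart are averaged for every pair of columns. The function name `acc_transform` and the parameter `max_lag` are hypothetical, and this is not the authors' released code.

```python
import numpy as np


def acc_transform(pssm, max_lag):
    """Map an (L, n_cols) score matrix to a vector of n_cols**2 * max_lag values.

    Assumed definition (not taken from the abstract): for each lag, the
    covariance between column a at position j and column b at position
    j + lag, averaged over all valid positions j.
    """
    pssm = np.asarray(pssm, dtype=float)
    length, _ = pssm.shape
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must satisfy 1 <= max_lag < sequence length")

    # Centre every column on its mean over the whole sequence.
    centered = pssm - pssm.mean(axis=0)

    features = []
    for lag in range(1, max_lag + 1):
        # cov[a, b] averages centered[j, a] * centered[j + lag, b] over j.
        # Diagonal entries are the auto covariance terms (same column),
        # off-diagonal entries are the cross covariance terms.
        cov = centered[:-lag].T @ centered[lag:] / (length - lag)
        features.append(cov.ravel())
    return np.concatenate(features)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two random stand-ins for PSSMs of different sequence lengths.
    short = acc_transform(rng.normal(size=(50, 20)), max_lag=4)
    long_ = acc_transform(rng.normal(size=(120, 20)), max_lag=4)
    print(short.shape, long_.shape)  # (1600,) (1600,)
```

Both inputs yield 20 × 20 × 4 = 1600 features regardless of sequence length, which is what allows the vectors to be passed to a standard classifier. An SVM implementation such as scikit-learn's `SVC` would be one option for that step; the abstract does not say which SVM software was used.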