Non-parametric classification of protein secondary structures

  • Authors:
  • Elias Zintzaras; Nigel P. Brown; Axel Kowald

  • Affiliations:
  • Department of Biomathematics, University of Thessaly School of Medicine, Papakyriazi 22, Larisa 41222, Greece; Biomedical Informatics Unit, Imperial Cancer Research Fund, London, UK; Max Planck Institute for Molecular Genetics, Berlin, Germany

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2006

Abstract

Proteins were classified into their families using a classification tree method based on the coefficients of variation of physico-chemical and geometrical properties of their secondary structures. The method uses the increase in purity obtained when a node is split into two subnodes as its splitting criterion, and the size of the tree is controlled by a threshold on the improvement in the apparent misclassification rate (AMR) of the tree after each splitting step. The classification tree method appears effective in reproducing structural groupings similar to those obtained by dynamic programming. For comparison, we also applied two other methods: neural networks and support vector machines. We show that the presented classification tree method performs better at classifying proteins into their families. The presented algorithm may be suitable for a rapid preliminary classification of proteins into their corresponding families.
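
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gini index as the purity measure, the population standard deviation in the coefficient of variation, the toy measurements, the family labels "A", "B", "C", and all function names (coefficient_of_variation, best_split, grow, predict) are assumptions made purely for illustration. Each protein is reduced to one coefficient of variation per property, nodes are split where purity increases most, and a split is kept only if it lowers the AMR of the tree by at least a chosen threshold.

```python
from collections import Counter
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """CV = standard deviation / mean (population SD is an assumption)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0

def gini(labels):
    """Gini impurity of a node; 0 means the node is pure (assumed purity measure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def errors(labels):
    """Cases misclassified if the node predicts its majority family."""
    return len(labels) - Counter(labels).most_common(1)[0][1]

def best_split(X, y):
    """Return (gain, feature, threshold) with the largest purity increase, or None."""
    best, n = None, len(y)
    for f in range(len(X[0])):
        vals = sorted(set(row[f] for row in X))
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            gain = gini(y) - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def grow(X, y, n_total, min_amr_gain=0.01):
    """Grow a binary tree; keep a split only if it reduces the tree AMR enough."""
    leaf = {"label": Counter(y).most_common(1)[0][0]}
    split = best_split(X, y) if len(set(y)) > 1 else None
    if split is None:
        return leaf
    _, f, t = split
    left = [(r, l) for r, l in zip(X, y) if r[f] <= t]
    right = [(r, l) for r, l in zip(X, y) if r[f] > t]
    yl, yr = [l for _, l in left], [l for _, l in right]
    # Splitting one leaf changes the tree AMR by exactly this amount.
    amr_gain = (errors(y) - errors(yl) - errors(yr)) / n_total
    if amr_gain < min_amr_gain:
        return leaf
    return {"feature": f, "threshold": t,
            "left": grow([r for r, _ in left], yl, n_total, min_amr_gain),
            "right": grow([r for r, _ in right], yr, n_total, min_amr_gain)}

def predict(tree, row):
    """Follow the splits down to a leaf and return its family label."""
    while "label" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["label"]

# Hypothetical toy data: two properties measured over each protein's
# secondary structure elements (values invented for illustration only).
toy = {
    "A": [([10, 10, 11, 9], [5, 5, 5, 6]), ([12, 11, 12, 13], [4, 4, 5, 4])],
    "B": [([2, 20, 5, 30], [5, 6, 5, 5]), ([1, 15, 3, 25], [4, 5, 4, 4])],
    "C": [([10, 11, 10, 9], [1, 9, 2, 12]), ([8, 8, 9, 8], [1, 10, 3, 14])],
}
X, y = [], []
for family, proteins in toy.items():
    for properties in proteins:
        X.append([coefficient_of_variation(p) for p in properties])
        y.append(family)

tree = grow(X, y, n_total=len(y))
amr = sum(predict(tree, row) != lab for row, lab in zip(X, y)) / len(y)
print("apparent misclassification rate:", amr)
```

Raising min_amr_gain yields a smaller tree at the cost of a higher AMR, which is the trade-off the threshold is meant to control. Because the AMR is computed on the same proteins used to build the tree, it is an optimistic estimate of the error on new proteins.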