Protein secondary structure prediction using distance based classifiers

  • Authors:
  • Ashish Ghosh; Bijnan Parai

  • Affiliations:
  • Machine Intelligence Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata 700108, India; Machine Intelligence Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata 700108, India

  • Venue:
  • International Journal of Approximate Reasoning
  • Year:
  • 2008

Abstract

De novo determination of protein structure is a significant research problem in bioinformatics. Biochemical procedures for determining protein structure are costly, and pattern classification techniques have been shown to ease this task. In this article, secondary structure prediction is mapped to a three-class pattern classification problem, where the classes are helix, sheet and coil. We analyze this problem using three distance-based classifiers: minimum distance, K-nearest neighbor and fuzzy K-nearest neighbor. The only information about the proteins used is the primary structure (the sequence of amino acids) itself. A new matrix-based representation of such categorical data is used to convert the sequence into real numbers. The classifiers are compared on some standard classification performance measures. The study finds that the simple minimum distance classifier performs better than the other two.
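
To make the three distance-based classifiers named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. Several choices are assumptions made for illustration: the abstract does not specify the matrix-based encoding, so `encode_window` substitutes a plain one-hot sliding window; the class labels `"H"`, `"E"` and `"C"` (helix, sheet, coil) are a common convention; Euclidean distance is assumed; and the fuzzy K-nearest neighbor memberships use the usual inverse-distance weighting with crisp training labels and a hypothetical fuzzifier `m`.

```python
import math
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode_window(seq, center, half_width=2):
    """One-hot encode the residues in a window around `center`.

    Assumed stand-in encoding; the paper's matrix-based representation
    is not described in the abstract. Out-of-range positions and
    unknown residues become all-zero blocks.
    """
    vec = []
    for pos in range(center - half_width, center + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(seq):
            idx = AMINO_ACIDS.find(seq[pos])
            if idx >= 0:
                one_hot[idx] = 1.0
        vec.extend(one_hot)
    return vec


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(train_x, train_y, query, k):
    """Return the k (vector, label) training pairs closest to the query."""
    pairs = sorted(zip(train_x, train_y), key=lambda p: euclidean(p[0], query))
    return pairs[:k]


def minimum_distance_predict(train_x, train_y, query):
    """Assign the class whose mean vector (centroid) is nearest to the query."""
    groups = defaultdict(list)
    for x, y in zip(train_x, train_y):
        groups[y].append(x)
    centroids = {
        y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()
    }
    return min(centroids, key=lambda y: euclidean(centroids[y], query))


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors."""
    votes = Counter(y for _, y in nearest(train_x, train_y, query, k))
    return votes.most_common(1)[0][0]


def fuzzy_knn_memberships(train_x, train_y, query, k=3, m=2.0):
    """Class memberships from inverse-distance weights over the k neighbors.

    Each neighbor contributes 1 / d^(2 / (m - 1)) to its own class;
    the memberships are normalized to sum to 1.
    """
    weights = defaultdict(float)
    for x, y in nearest(train_x, train_y, query, k):
        d = max(euclidean(x, query), 1e-9)  # guard against zero distance
        weights[y] += 1.0 / d ** (2.0 / (m - 1.0))
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}


# Toy 2-D vectors standing in for encoded residues (illustrative data only).
train_x = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
train_y = ["H", "H", "E", "E", "C", "C"]
query = [0.2, 0.4]

md_label = minimum_distance_predict(train_x, train_y, query)
knn_label = knn_predict(train_x, train_y, query, k=3)
memberships = fuzzy_knn_memberships(train_x, train_y, query, k=3)
fuzzy_label = max(memberships, key=memberships.get)
```

In practice each residue of a sequence would be turned into a vector with something like `encode_window(seq, i)` and passed as the query. The minimum distance classifier only stores one centroid per class, which makes it far cheaper at prediction time than the two neighbor-based methods, which must search the whole training set.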