Prediction of subcellular localization of proteins using pairwise sequence alignment and support vector machine

  • Authors:
  • Jong Kyoung Kim; G. P. S. Raghava; Sung-Yang Bang; Seungjin Choi

  • Affiliations:
  • Department of Computer Science, Pohang University of Science and Technology, San 31 Hyoja-dong, Nam-gu, Pohang 790-784, Republic of Korea; Bioinformatics Centre, Institute of Microbial Technology, Sector 39A, Chandigarh, India; Department of Computer Science, Pohang University of Science and Technology, San 31 Hyoja-dong, Nam-gu, Pohang 790-784, Republic of Korea; Department of Computer Science, Pohang University of Science and Technology, San 31 Hyoja-dong, Nam-gu, Pohang 790-784, Republic of Korea

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2006

Quantified Score

Hi-index 0.10

Abstract

Predicting the destination of a protein in a cell is important for annotating the function of the protein. Recent advances have enabled more accurate methods for predicting the subcellular localization of proteins. One of the most important factors in improving the accuracy of these methods is the introduction of new, useful features for protein sequences. In this paper, we present a new method for extracting appropriate features from sequence data by computing pairwise sequence alignment scores. A support vector machine (SVM) is used as the classifier. The overall prediction accuracy, evaluated by the jackknife validation technique, reached 94.70% for the eukaryotic non-plant data set and 92.10% for the eukaryotic plant data set, the highest prediction accuracy among the methods reported so far on these data sets. Our experimental results confirm that feature extraction based on pairwise sequence alignment is useful for this classification problem.
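
The feature extraction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Smith-Waterman local alignment, the scoring values (match +2, mismatch -1, linear gap -2), the function names, and the toy sequences are all assumptions chosen for brevity. The abstract does not state the alignment algorithm or scoring scheme. Each sequence is represented by the vector of its alignment scores against a set of reference sequences; such vectors could then be passed to an SVM implementation of the reader's choice.

```python
from typing import List, Sequence


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def alignment_features(query: str, references: Sequence[str]) -> List[int]:
    """Represent a sequence by its alignment scores against all references."""
    return [local_alignment_score(query, ref) for ref in references]


# Hypothetical toy sequences; real inputs would be full protein sequences.
references = ["MKTAYIAKQR", "MKTAYLAKQR", "GSHMLEDPVA"]
features = [alignment_features(seq, references) for seq in references]
```

With n reference sequences, every protein becomes an n-dimensional vector, so similar sequences tend to have similar score profiles regardless of their lengths.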