Classification of Ligase Function Based on Multi-parametric Feature Extracted from Protein Sequence

  • Authors:
  • Bum Ju Lee;Heon Gyu Lee;Moon Sun Shin;Keun Ho Ryu

  • Affiliations:
  • Database/Bioinformatics Lab., Chungbuk National University, Chungbuk, South Korea 361-763;Database/Bioinformatics Lab., Chungbuk National University, Chungbuk, South Korea 361-763;Dept. of Computer Science, Konkuk University, Chungbuk, South Korea 380-701;Database/Bioinformatics Lab., Chungbuk National University, Chungbuk, South Korea 361-763

  • Venue:
  • ICCSA '08 Proceedings of the international conference on Computational Science and Its Applications, Part II
  • Year:
  • 2008

Abstract

One of the important goals of bioinformatics is to classify and predict the functions of proteins that have no sequence homologs of known function. The purpose of this paper is to classify protein function using multi-parametric features, without relying on sequence similarity. First, we propose a method for generating novel features that represent various kinds of local information in a protein sequence, based on positively and negatively charged residues. Then, we introduce a process for building an optimal feature subset by combining traditional and novel features extracted from the protein sequence. Finally, we classify ligase enzymes with a support vector machine (SVM). In the experiment, feature selection retained only 375 of 483 features, and the classification accuracy for the 4th sub-classes of the Enzyme Commission (EC) number was 98.35%. Our results demonstrate that most of the novel features are valuable for classifying specific enzyme functions.
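The abstract does not spell out how the charge-based local features are computed. As a rough illustration only, the sketch below shows one way such features might look: the fraction of positively and negatively charged residues over the whole sequence and within equal-length segments. The function name `charge_features`, the segment count, and the residue sets (K and R as positive, D and E as negative, histidine left out) are assumptions made for this example, not the authors' definitions.

```python
from typing import List

# Assumed residue classes for illustration; the paper may define them differently.
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def charge_features(seq: str, n_segments: int = 3) -> List[float]:
    """Return global and per-segment fractions of charged residues.

    Output layout: [global_pos, global_neg, seg1_pos, seg1_neg, ...].
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def fractions(part: str) -> List[float]:
        # An empty segment (sequence shorter than n_segments) contributes zeros.
        if not part:
            return [0.0, 0.0]
        pos = sum(ch in POSITIVE for ch in part) / len(part)
        neg = sum(ch in NEGATIVE for ch in part) / len(part)
        return [pos, neg]

    features = fractions(seq)  # global composition
    for i in range(n_segments):
        # Integer arithmetic splits the sequence into near-equal segments.
        start = i * len(seq) // n_segments
        end = (i + 1) * len(seq) // n_segments
        features.extend(fractions(seq[start:end]))  # local composition
    return features


if __name__ == "__main__":
    print(charge_features("KKDDAA", n_segments=3))
```

Vectors of this kind could be concatenated with traditional sequence features, reduced by a feature selection step, and passed to an SVM classifier, which is the overall pipeline the abstract describes.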