Prediction of the O-glycosylation sites in protein by layered neural networks and support vector machines

  • Authors:
  • Ikuko Nishikawa; Hirotaka Sakamoto; Ikue Nouno; Takeshi Iritani; Kazutoshi Sakakibara; Masahiro Ito

  • Affiliations:
  • College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan (all authors)

  • Venue:
  • KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part II
  • Year:
  • 2006

Abstract

O-glycosylation, one of the main types of mammalian protein glycosylation, is specific to serine and threonine residues, although no consensus sequence is known. In this paper, a layered neural network and a support vector machine (SVM) are used to predict O-glycosylation sites. A protein sequence within a fixed-size window is encoded in three ways as input: a sparse coding that distinguishes all 20 amino acid residues, a 5-letter coding, and a hydropathy coding. In the neural network, a single output unit predicts whether a particular serine or threonine site is glycosylated, while the SVM classifies each site into one of two classes. Performance is evaluated by the Matthews correlation coefficient. Preliminary results with the neural network show that the sparse and 5-letter codings outperform the hydropathy coding, while the SVM results show that enlarging the window improves performance only to a limited extent.
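
As an illustration of two ingredients named in the abstract, the following minimal Python sketch builds a sparse (one-hot) encoding of a fixed-size window around a candidate site and computes the Matthews correlation coefficient from confusion-matrix counts. It is not the authors' implementation. The function names, the window half-width, the use of '-' with an all-zero vector for positions beyond the sequence ends, and the toy sequence are all assumptions made for this example. The 5-letter and hydropathy codings are left out because the abstract does not specify their groupings or scale.

```python
import math

# The 20 standard amino acids in one-letter code (ordering is arbitrary here).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_window(sequence, site, half_width):
    """Return the residues within half_width positions of site.

    Positions beyond either end of the sequence are padded with '-'.
    This padding rule is an assumption of the sketch.
    """
    window = []
    for i in range(site - half_width, site + half_width + 1):
        window.append(sequence[i] if 0 <= i < len(sequence) else "-")
    return "".join(window)


def sparse_encode(window):
    """One-hot encode a window: 20 values per position.

    Padding or unknown symbols become an all-zero block (assumption).
    """
    vector = []
    for residue in window:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    toy_sequence = "MSTPA"  # made-up sequence, not real data
    window = extract_window(toy_sequence, site=1, half_width=2)
    features = sparse_encode(window)
    print(window)          # -MSTP
    print(len(features))   # 100 (5 positions x 20 residues)
    print(matthews_cc(tp=5, tn=5, fp=0, fn=0))  # 1.0 for perfect prediction
```

The resulting feature vector could be fed to either classifier. With a window of w positions the sparse coding yields 20w inputs, so input dimensionality grows linearly with window size.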