Mining significant tree patterns in carbohydrate sugar chains

  • Authors:
  • Kosuke Hashimoto; Ichigaku Takigawa; Motoki Shiga; Minoru Kanehisa; Hiroshi Mamitsuka

  • Venue:
  • Bioinformatics
  • Year:
  • 2008

Abstract

Motivation: Carbohydrate sugar chains, or glycans, the third major class of macromolecules, have branched tree structures. Glycan motifs are known to be of two types: (1) conserved patterns called ‘cores’, which contain the root, and (2) ubiquitous motifs, which appear in external parts including leaves and are distributed over different glycan classes. Finding these glycan tree motifs is an important issue, but there have been no computational methods to capture these motifs efficiently.

Results: We have developed an efficient method for mining motifs, or significant subtrees, from glycans. The key contributions of this method are: (1) a new concept, ‘α-closed frequent subtrees’, together with an efficient method for mining all these subtrees from given trees, and (2) the application of statistical hypothesis testing to rerank the frequent subtrees by significance. We experimentally verified the effectiveness of the proposed method using real glycans: (1) We examined the top 10 subtrees obtained by our method at some parameter setting and confirmed that all subtrees are significant motifs in glycobiology. (2) We applied the results of our method to a classification problem and found that our method outperformed other competing methods, SVM with three different tree kernels, with all differences being statistically significant.

Contact: mami@kuicr.kyoto-u.ac.jp

Supplementary information: Supplementary data are available at Bioinformatics online.
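
The reranking idea in the Results (mine frequent subtrees, then order them by statistical significance) can be illustrated with a minimal sketch. The abstract does not name the test used, so this sketch assumes a one-sided Fisher exact test on how often a subtree occurs in one glycan class versus the whole collection. The pattern labels, the counts, and the function names `fisher_right_tail` and `rerank` are hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_right_tail(k, n_pos, n_with, n_total):
    """One-sided Fisher exact p-value (assumed test, not named in the abstract).

    Returns P(X >= k), where X is the number of class members that contain
    the pattern if the n_with pattern-containing trees were spread at random
    over n_total trees, n_pos of which belong to the class of interest.
    """
    upper = min(n_pos, n_with)
    total_ways = comb(n_total, n_pos)
    tail = sum(
        comb(n_with, i) * comb(n_total - n_with, n_pos - i)
        for i in range(k, upper + 1)
    )
    return tail / total_ways


def rerank(patterns, n_pos, n_total):
    """Sort patterns by ascending p-value (most significant first).

    patterns maps a label to (count in class, count in all trees).
    """
    scored = [
        (label, fisher_right_tail(k, n_pos, n_with, n_total))
        for label, (k, n_with) in patterns.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical toy counts: 20 trees in total, 10 of them in the class.
toy_patterns = {
    "pattern_C": (10, 20),  # present in every tree: frequent, uninformative
    "pattern_A": (9, 10),   # concentrated in the class
    "pattern_B": (5, 10),   # spread evenly over both groups
}
ranked = rerank(toy_patterns, n_pos=10, n_total=20)

for label, p_value in ranked:
    print(f"{label}: p = {p_value:.6f}")
```

In this toy setting the pattern that occurs in every tree has the highest frequency but ends up last, which is the motivation for reranking frequent subtrees by significance rather than by raw support.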