Taxonomy-superimposed graph mining

  • Authors:
  • Ali Cakmak;Gultekin Ozsoyoglu

  • Affiliations:
  • Case Western Reserve University, Cleveland, OH;Case Western Reserve University, Cleveland, OH

  • Venue:
  • EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

New graph structures where node labels are members of hierarchically organized ontologies or taxonomies have become commonplace in different domains, e.g., life sciences. It is a challenging task to mine for frequent patterns in this new graph model which we call taxonomy-superimposed graphs, as there may be many patterns that are implied by the generalization/specialization hierarchy of the associated node label taxonomy. Hence, 'standard' graph mining techniques are not directly applicable. In this paper, we present Taxogram, a taxonomy-superimposed graph mining algorithm that can efficiently discover frequent graph structures in a database of taxonomy-superimposed graphs. Taxogram has two advantages: (i) It performs a subgraph isomorphism test once per class of patterns which are structurally isomorphic, but have different labels, and (ii) it reconciles standard graph mining methods with taxonomy-based graph mining and takes advantage of well-studied methods in the literature. Taxogram has three stages: (a) relabeling nodes in the input database, (b) mining pattern classes/families and constructing associated occurrence indices, and (c) computing patterns and eliminating useless (i.e., over-generalized) patterns by post-processing occurrence indices. Experimental results show that Taxogram is significantly more efficient and more scalable compared to other alternative approaches.