Pruning Strategies Based on the Upper Bound of Information Gain for Discriminative Subgraph Mining

  • Authors:
  • Kouzou Ohara;Masahiro Hara;Kiyoto Takabayashi;Hiroshi Motoda;Takashi Washio

  • Affiliations:
  • The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan 567-0047 (all authors)

  • Venue:
  • Knowledge Acquisition: Approaches, Algorithms and Applications
  • Year:
  • 2009

Abstract

Given a set of graphs with class labels, the discriminative subgraphs that appear in them are useful for constructing a classification model. A graph mining technique called Chunkingless Graph-Based Induction (Cl-GBI) can find such discriminative subgraphs in graph-structured data. However, because of its time and space complexity, Cl-GBI sometimes fails to extract subgraphs that are good enough to characterize the given data. To improve its efficiency, we propose pruning methods based on the upper bound of information gain, which Cl-GBI uses as its criterion for the discriminability of subgraphs. The upper bound of the information gain of a subgraph is the maximum information gain that any of its supergraphs can achieve. By comparing the upper bound of each subgraph with the best information gain found so far, Cl-GBI can exclude unfruitful subgraphs from its search space. Furthermore, we experimentally evaluate the effectiveness of the pruning methods on a real-world dataset and artificial datasets.
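
The pruning idea can be illustrated with a minimal sketch. The abstract does not give the exact bound used in the paper, so the code below assumes a two-class setting and the standard convexity-based bound: a supergraph can occur only in graphs that already contain the subgraph, so its class counts (x', y') satisfy x' <= x and y' <= y, and the gain over that region is largest at (x, 0) or (0, y). The function names (`entropy`, `info_gain`, `gain_upper_bound`, `can_prune`) and the parameters `P`, `N`, `x`, `y` are hypothetical and are not taken from Cl-GBI.

```python
from math import log2


def entropy(pos, neg):
    """Binary class entropy of a set with `pos` positive and `neg` negative graphs."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def info_gain(P, N, x, y):
    """Information gain of splitting P positive and N negative graphs by a
    subgraph contained in x positive and y negative graphs."""
    total = P + N
    inside = x + y
    outside = total - inside
    return (entropy(P, N)
            - inside / total * entropy(x, y)
            - outside / total * entropy(P - x, N - y))


def gain_upper_bound(P, N, x, y):
    """Assumed bound on the gain of any supergraph of a subgraph with counts
    (x, y): the better of the two 'pure' extremes (x, 0) and (0, y)."""
    return max(info_gain(P, N, x, 0), info_gain(P, N, 0, y))


def can_prune(P, N, x, y, best_gain):
    """True if no supergraph can beat the best gain found so far
    (strict comparison, so ties are kept in the search space)."""
    return gain_upper_bound(P, N, x, y) < best_gain


if __name__ == "__main__":
    P, N = 10, 10          # illustrative class sizes, not from the paper
    print(info_gain(P, N, 6, 3), gain_upper_bound(P, N, 6, 3))
    print(can_prune(P, N, 1, 1, best_gain=0.5))
```

Under these assumptions, a subgraph that occurs in only one positive and one negative graph can be discarded once a gain of 0.5 has been found, because even its best possible supergraph stays well below that value.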