Cl-GBI: a novel approach for extracting typical patterns from graph-structured data

  • Authors:
  • Phu Chien Nguyen;Kouzou Ohara;Hiroshi Motoda;Takashi Washio

  • Affiliations:
  • The Institute of Scientific and Industrial Research, Osaka University, Ibaraki, Osaka, Japan (all four authors)

  • Venue:
  • PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2005

Abstract

Graph-Based Induction (GBI) is a machine learning technique for extracting typical patterns from graph-structured data by stepwise pair expansion (pairwise chunking). GBI is very efficient because of its greedy search strategy; however, it suffers from the problem of overlapping subgraphs. As a result, some typical patterns cannot be discovered by GBI, even though a beam search has been incorporated into an improved version called Beam-wise GBI (B-GBI). In this paper, the search capability is improved by a new search strategy in which frequent pairs are never chunked but are instead used as pseudo-nodes in subsequent steps, thus allowing overlapping subgraphs to be extracted. The new algorithm, called Cl-GBI (Chunkingless GBI), was tested on two datasets, the promoter dataset from the UCI repository and the hepatitis dataset provided by Chiba University, and was shown to extract more typical patterns than B-GBI.
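
The overlap problem described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the graph encoding, the function names (`count_pairs`, `chunk`, `frequent_pairs`), and the plain frequency threshold are all assumptions made for illustration, and the selection criteria, the beam search, and the expansion of pseudo-nodes in later steps are omitted. The sketch only shows that contracting one frequent pair can hide an overlapping pair, whereas keeping the graph unchanged and merely registering frequent pairs leaves both available.

```python
from collections import Counter

def count_pairs(labels, edges):
    """Count label pairs joined by an edge (each pair is a sorted label tuple)."""
    return Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)

def chunk(labels, edges, pair):
    """Greedy chunking (illustrative): contract non-overlapping occurrences
    of `pair` into single nodes and return the rewritten graph."""
    labels = dict(labels)
    merged = {}                      # old node id -> id of the chunk node
    next_id = max(labels) + 1
    for u, v in sorted(edges):
        if u in merged or v in merged:
            continue                 # node already consumed by another chunk
        if tuple(sorted((labels[u], labels[v]))) == pair:
            merged[u] = merged[v] = next_id
            labels[next_id] = "-".join(pair)
            next_id += 1
    new_edges = set()
    for u, v in edges:
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b:                   # drop edges that now lie inside a chunk
            new_edges.add((min(a, b), max(a, b)))
    for old in merged:
        del labels[old]
    return labels, new_edges

def frequent_pairs(labels, edges, min_count):
    """Chunkingless step (illustrative): record frequent pairs and their
    occurrences as pseudo-nodes WITHOUT rewriting the graph."""
    occurrences = {}
    for u, v in sorted(edges):
        pair = tuple(sorted((labels[u], labels[v])))
        occurrences.setdefault(pair, []).append((u, v))
    return {p: occ for p, occ in occurrences.items() if len(occ) >= min_count}

# Toy graph (hypothetical data): two copies of the path a - b - c.
labels = {1: "a", 2: "b", 3: "c", 4: "a", 5: "b", 6: "c"}
edges = {(1, 2), (2, 3), (4, 5), (5, 6)}

before = count_pairs(labels, edges)
chunked_labels, chunked_edges = chunk(labels, edges, ("a", "b"))
after = count_pairs(chunked_labels, chunked_edges)
pseudo = frequent_pairs(labels, edges, min_count=2)

print(before[("b", "c")])   # 2: the pair b-c is visible in the original graph
print(after[("b", "c")])    # 0: chunking a-b consumed the shared node b
print(sorted(pseudo))       # [('a', 'b'), ('b', 'c')]: both overlapping pairs kept
```

In this toy graph the pairs a-b and b-c share the node labelled b, so chunking a-b removes every occurrence of b-c from the rewritten graph. The chunkingless variant keeps both pairs, together with their occurrences, as candidates for later expansion.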