gSpan: Graph-Based Substructure Pattern Mining

  • Authors:
  • Xifeng Yan;Jiawei Han

  • Venue:
  • ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
  • Year:
  • 2002

Quantified Score

Hi-index 0.01

Abstract

We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts the depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.
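
To make the idea of a canonical label concrete, the following is a minimal brute-force sketch in Python, not the algorithm from the paper. It enumerates every DFS code of a small connected, undirected, labeled graph and takes the smallest one. Two things are assumptions made for illustration: the function names (`dfs_codes`, `canonical_label`) and the graph encoding are hypothetical, and plain Python tuple comparison stands in for the DFS lexicographic order defined in the paper. The resulting label may therefore differ from the paper's minimum DFS code, but it is still identical for isomorphic graphs, which is the property the abstract relies on.

```python
def dfs_codes(vlabels, edges):
    """Yield every DFS code of a connected, undirected, labeled graph.

    vlabels: dict vertex -> label; edges: dict (u, v) -> edge label.
    A code is a tuple of (i, j, label_i, edge_label, label_j) entries,
    where i and j are DFS discovery indices.
    """
    adj = {v: {} for v in vlabels}
    for (u, v), lab in edges.items():
        adj[u][v] = lab
        adj[v][u] = lab

    def extend(stack, idx, code):
        if len(idx) == len(vlabels):      # every vertex discovered, code complete
            yield tuple(code)
            return
        # Backtrack to the deepest vertex that still has an undiscovered neighbor.
        while all(n in idx for n in adj[stack[-1]]):
            stack = stack[:-1]
        v = stack[-1]
        for u in adj[v]:
            if u in idx:
                continue
            new_idx = {**idx, u: len(idx)}
            # Forward edge that discovers u ...
            step = [(idx[v], new_idx[u], vlabels[v], adj[v][u], vlabels[u])]
            # ... followed by u's backward edges to earlier vertices, in index order.
            back = sorted((idx[w], w) for w in adj[u] if w in idx and w != v)
            step += [(new_idx[u], i, vlabels[u], adj[u][w], vlabels[w])
                     for i, w in back]
            yield from extend(stack + [u], new_idx, code + step)

    for root in vlabels:
        yield from extend([root], {root: 0}, [])


def canonical_label(vlabels, edges):
    """Smallest DFS code under tuple order (a stand-in for the paper's order)."""
    return min(dfs_codes(vlabels, edges))


# The same labeled triangle written with two different vertex namings.
g1 = ({1: "a", 2: "a", 3: "b"}, {(1, 2): "x", (2, 3): "y", (1, 3): "y"})
g2 = ({"p": "b", "q": "a", "r": "a"},
      {("q", "r"): "x", ("r", "p"): "y", ("q", "p"): "y"})
# A triangle with one edge label changed, so it is not isomorphic to g1.
g3 = ({1: "a", 2: "a", 3: "b"}, {(1, 2): "x", (2, 3): "y", (1, 3): "x"})

same = canonical_label(*g1) == canonical_label(*g2)
different = canonical_label(*g1) != canonical_label(*g3)
```

Here `same` and `different` both evaluate to `True`: the two namings of the triangle share one label, and the relabeled triangle gets another. Enumerating all DFS codes is exponential in the worst case, so this sketch is only useful for tiny graphs.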