Duplicate candidate elimination and fast support calculation for frequent subgraph mining

  • Authors:
  • Andrés Gago-Alonso;Jesús Ariel Carrasco-Ochoa;José Eladio Medina-Pagola;José Fco. Martínez-Trinidad

  • Affiliations:
  • Advanced Technologies Application Center, La Habana, Cuba and National Institute of Astrophysics, Optics and Electronics, Puebla, Mexico;National Institute of Astrophysics, Optics and Electronics, Puebla, Mexico;Advanced Technologies Application Center, La Habana, Cuba;National Institute of Astrophysics, Optics and Electronics, Puebla, Mexico

  • Venue:
  • IDEAL'09 Proceedings of the 10th international conference on Intelligent data engineering and automated learning
  • Year:
  • 2009

Abstract

Frequent connected subgraph mining (FCSM) has a wide range of real-world applications. Most previous studies focus on pruning the search space or on optimizing subgraph isomorphism (SI) tests. This paper introduces a new property that removes all duplicate candidates during enumeration in FCSM. Based on this property, a new FCSM algorithm called gdFil is proposed. Because the candidate space contains no duplicates, a fast evaluation strategy can be used to reduce the cost of SI tests without wasting memory, and a data structure is introduced for this purpose. The performance of gdFil is compared against other reported algorithms.
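
The abstract does not state the duplicate-elimination property or the data structure used by gdFil, so the following is only a minimal sketch of the two generic ideas it refers to: recognizing duplicate (isomorphic) candidates and counting support with SI tests. It is not the gdFil method. The graph encoding (a list of vertex labels plus a list of `(u, v, edge_label)` triples) and the helper names `canonical_form`, `deduplicate`, `occurs_in`, and `support` are assumptions made for illustration, and the brute-force search is practical only for very small graphs.

```python
from itertools import permutations


def canonical_form(labels, edges):
    """Return the smallest encoding of a labeled graph over all vertex orderings.

    Two graphs are isomorphic exactly when their canonical forms are equal.
    Brute force over all permutations: illustrative only.
    """
    n = len(labels)
    best = None
    for perm in permutations(range(n)):
        # perm[old] is the new index assigned to vertex `old`.
        new_labels = [None] * n
        for old, new in enumerate(perm):
            new_labels[new] = labels[old]
        new_edges = tuple(sorted(
            (min(perm[u], perm[v]), max(perm[u], perm[v]), lab)
            for u, v, lab in edges
        ))
        code = (tuple(new_labels), new_edges)
        if best is None or code < best:
            best = code
    return best


def deduplicate(candidates):
    """Keep one representative per isomorphism class of candidate graphs."""
    seen = set()
    unique = []
    for labels, edges in candidates:
        code = canonical_form(labels, edges)
        if code not in seen:
            seen.add(code)
            unique.append((labels, edges))
    return unique


def occurs_in(pattern, graph):
    """Naive subgraph isomorphism (SI) test: is `pattern` contained in `graph`?"""
    p_labels, p_edges = pattern
    g_labels, g_edges = graph
    g_edge_set = {(min(u, v), max(u, v), lab) for u, v, lab in g_edges}
    # Try every injective mapping of pattern vertices to graph vertices.
    for image in permutations(range(len(g_labels)), len(p_labels)):
        if any(p_labels[i] != g_labels[image[i]] for i in range(len(p_labels))):
            continue
        if all(
            (min(image[u], image[v]), max(image[u], image[v]), lab) in g_edge_set
            for u, v, lab in p_edges
        ):
            return True
    return False


def support(pattern, database):
    """Number of database graphs that contain the pattern."""
    return sum(occurs_in(pattern, g) for g in database)
```

In this sketch every duplicate candidate still costs a canonical-form computation before it is discarded. The approach described in the abstract instead avoids generating duplicates during enumeration, which is what allows the cheaper evaluation strategy it mentions.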