Efficient algorithms based on relational queries to mine frequent graphs

  • Authors:
  • Walter Garcia;Carlos Ordonez;Kai Zhao;Ping Chen

  • Affiliations:
  • University of Houston - Downtown, Houston, TX, USA;University of Houston, Houston, TX, USA;University of Houston, Houston, TX, USA;University of Houston - Downtown, Houston, TX, USA

  • Venue:
  • PIKM '10 Proceedings of the 3rd workshop on Ph.D. students in information and knowledge management
  • Year:
  • 2010

Abstract

Frequent subgraph mining is an important problem in data mining with wide applications in science. For instance, graphs can represent structural relationships in problems involving network topology, chemical compounds, protein structures, and so on. Searching for patterns in graph databases is difficult because graph operations generally have higher time complexity than the equivalent operations on frequent itemsets. From a practical standpoint, databases keep growing, creating both opportunities and a need to mine graphs. Even though there is a significant body of work on graph mining, most techniques work outside the database system. Programming frequent graph mining in SQL is more difficult than in traditional approaches because the graph must be represented as a table and algorithmic steps must be written as relational queries. In our research, we study three fundamental problems under a database approach: graph storage and indexing, frequent subgraph search, and identifying subgraph isomorphism. We outline the main research issues and our proposed solutions. We also present preliminary experimental validation focusing on query optimizations and time complexity.
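To illustrate what "representing the graph as a table and writing algorithmic steps as relational queries" can look like, here is a minimal sketch. It is not the authors' algorithm: the `edge(gid, src, dst)` schema, the function name `frequent_edges`, and the use of SQLite through Python are all assumptions made for the example. It covers only the simplest step, counting the support of single labeled edges (the size-1 subgraphs) across a collection of undirected graphs, where an edge is identified by the pair of labels at its endpoints.

```python
import sqlite3


def frequent_edges(graphs, min_support):
    """Return (src, dst, support) for labeled edges found in >= min_support graphs.

    graphs: dict mapping a graph id to a list of (label_a, label_b) edges.
    Edges are treated as undirected, so each pair is stored in sorted order.
    The table layout below is a hypothetical schema chosen for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (gid INTEGER, src TEXT, dst TEXT)")

    # One row per edge; normalizing the endpoint order makes (O, C) == (C, O).
    rows = []
    for gid, edges in graphs.items():
        for a, b in edges:
            lo, hi = sorted((a, b))
            rows.append((gid, lo, hi))
    conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", rows)

    # Support = number of distinct graphs containing the edge.
    cur = conn.execute(
        "SELECT src, dst, COUNT(DISTINCT gid) AS support "
        "FROM edge "
        "GROUP BY src, dst "
        "HAVING COUNT(DISTINCT gid) >= ? "
        "ORDER BY support DESC, src, dst",
        (min_support,),
    )
    result = cur.fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    demo = {
        1: [("C", "O"), ("C", "N")],
        2: [("O", "C"), ("C", "C")],
        3: [("C", "O"), ("N", "C")],
    }
    print(frequent_edges(demo, 2))
```

Larger candidate subgraphs would typically require self-joins of the edge table on shared vertices, which is where query optimization and isomorphism checking become the dominant costs.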