PrefIndex: an efficient supergraph containment search technique

  • Authors:
  • Gaoping Zhu; Xuemin Lin; Wenjie Zhang; Wei Wang; Haichuan Shang

  • Affiliations:
  • The University of New South Wales, Sydney, NSW, Australia (all authors)

  • Venue:
  • SSDBM'10 Proceedings of the 22nd international conference on Scientific and statistical database management
  • Year:
  • 2010

Abstract

Graphs are widely used in many applications to model complex data structures. In this paper, we study the problem of supergraph containment search. To avoid running the NP-complete subgraph isomorphism test on every database graph, most existing work follows the filtering-verification framework and selects graph features to build effective indexes, which filter out false results (graphs) before the costly verification is conducted. However, searching for features multiple times in the query graphs yields a large amount of redundant computation, which has led to the emergence of the computation-sharing framework. This paper follows the computation-sharing framework to process supergraph containment queries efficiently. First, database graphs are clustered into disjoint groups so that the computation cost can be shared within each group. Although maximizing the computation-sharing benefit of a clustering is shown to be NP-hard, an efficient algorithm is developed that approximates the optimal solution with an approximation factor of 1/2. A novel prefix-sharing indexing technique, PrefIndex, is then proposed, based on which an efficient query processing algorithm integrating both filtering and verification is developed. Finally, PrefIndex is enhanced with multi-level sharing and suffix-sharing to further avoid redundant computation. An extensive empirical study demonstrates the efficiency and scalability of our techniques, which achieve orders of magnitude of speed-up over state-of-the-art techniques.
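
For readers unfamiliar with the problem, the following is a minimal sketch of the baseline filtering-verification approach that the abstract refers to: given a query graph q, return every database graph g that is contained in q as a subgraph. It is not the PrefIndex method and shares no computation between database graphs. The names `Graph`, `passes_filter`, `is_subgraph`, and `supergraph_search` are hypothetical, the label-count filter is an assumed stand-in for the feature-based indexes the abstract mentions, and the verification step is a plain backtracking search on small, vertex-labeled, undirected graphs.

```python
from collections import Counter


class Graph:
    """Undirected graph with one label per vertex (vertices are 0..n-1)."""

    def __init__(self, labels, edges):
        self.labels = list(labels)
        self.edges = {frozenset(e) for e in edges}


def passes_filter(g, q):
    """Cheap necessary condition for g being a subgraph of q (assumed filter):
    q needs at least as many edges, and at least as many vertices per label."""
    need, have = Counter(g.labels), Counter(q.labels)
    if len(g.edges) > len(q.edges):
        return False
    return all(have[label] >= count for label, count in need.items())


def is_subgraph(g, q):
    """Verification: backtracking test for a (non-induced) subgraph
    isomorphism from g into q. Exponential in the worst case."""
    mapping = {}   # vertex of g -> vertex of q
    used = set()   # vertices of q already taken

    def extend(i):
        if i == len(g.labels):
            return True
        for v in range(len(q.labels)):
            if v in used or q.labels[v] != g.labels[i]:
                continue
            # Every edge from i to an already mapped vertex must exist in q.
            ok = all(
                frozenset((v, mapping[u])) in q.edges
                for u in range(i)
                if frozenset((i, u)) in g.edges
            )
            if ok:
                mapping[i] = v
                used.add(v)
                if extend(i + 1):
                    return True
                del mapping[i]
                used.discard(v)
        return False

    return extend(0)


def supergraph_search(database, q):
    """Return the names of all database graphs contained in query q."""
    return [
        name
        for name, g in database.items()
        if passes_filter(g, q) and is_subgraph(g, q)
    ]


# Toy data (made up for illustration): the query is the path C-C-O-N.
query = Graph(["C", "C", "O", "N"], [(0, 1), (1, 2), (2, 3)])
database = {
    "g1": Graph(["C", "O"], [(0, 1)]),
    "g2": Graph(["C", "C", "C"], [(0, 1), (1, 2)]),
    "g3": Graph(["C", "N"], [(0, 1)]),
    "g4": Graph(["C", "C", "O"], [(0, 1), (1, 2)]),
}
answers = supergraph_search(database, query)
```

In this toy example, g2 is rejected by the filter alone, while g3 passes the filter but fails verification. Each surviving database graph is verified against the query independently, which is the kind of repeated work that computation sharing within a group of similar graphs aims to reduce.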