Graph summarization for indexing paths in graph-structured data

  • Authors:
  • Shriraghav Kaushik; Jeffrey F. Naughton

  • Venue:
  • Graph summarization for indexing paths in graph-structured data
  • Year:
  • 2003

Abstract

With the rapidly increasing popularity of XML for data representation, there is considerable interest in query processing over data that conforms to a labeled-tree or labeled-graph data model. An important part of querying such data is traversing the labeled graph by means of path expressions. This thesis studies auxiliary data structures known as path indexes, which are intended to speed up the evaluation of path expressions. The approach to path indexing adopted in this thesis is to construct a smaller summary graph and then evaluate path expressions over that summary. We first describe formally why the path indexing problem differs from the traditional indexing problem in database systems. Next, we propose an index specification framework that can be used to define a wide variety of path indexes, each covering a different set of path expressions. For a large class of index specifications, the techniques used in this framework yield the smallest index that is suitable for the respective set of path expressions. These techniques are based on the notion of graph bisimilarity. We then study how XML queries can be processed in a native XML database management system using path indexes in conjunction with inverted lists. We also analyze how this integration of path indexing and inverted lists can be used to answer information-retrieval-style, relevance-based queries. The algorithms we obtain in this context are instance optimal, a notion of optimality recently introduced by Fagin et al. Finally, in the last part of this thesis, we study how the proposed path indexes can be maintained as the underlying data changes.
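
To make the summary-graph idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the thesis's own algorithm or index specification framework. It assumes the data is a node-labeled directed graph, that a path expression is a plain sequence of labels matched anywhere in the graph, and that nodes are grouped by a naive fixed-point refinement of backward bisimilarity (same label, parents in the same groups). All function names and the sample data are hypothetical.

```python
from collections import defaultdict


def backward_bisimulation(labels, edges):
    """Group nodes that share a label and whose parents lie in the same groups."""
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)

    block = dict(labels)  # start by grouping nodes on label alone
    while True:
        ids = {}
        refined = {}
        for node in labels:
            signature = (block[node], frozenset(block[p] for p in parents[node]))
            refined[node] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return refined  # no group was split, so the partition is stable
        block = refined


def build_summary(labels, edges):
    """Collapse each group into one summary node and remember its extent."""
    block = backward_bisimulation(labels, edges)
    extent = defaultdict(set)
    for node, b in block.items():
        extent[b].add(node)
    block_label = {b: labels[next(iter(nodes))] for b, nodes in extent.items()}
    block_edges = {(block[src], block[dst]) for src, dst in edges}
    return extent, block_label, block_edges


def eval_path(labels, edges, path):
    """Return the nodes reached by a label sequence, matched anywhere in the graph."""
    children = defaultdict(set)
    for src, dst in edges:
        children[src].add(dst)
    frontier = {n for n in labels if labels[n] == path[0]}
    for label in path[1:]:
        frontier = {c for n in frontier for c in children[n] if labels[c] == label}
    return frontier


def query_with_summary(summary, path):
    """Evaluate the path on the summary graph, then expand groups to data nodes."""
    extent, block_label, block_edges = summary
    hits = eval_path(block_label, block_edges, path)
    return set().union(*(extent[b] for b in hits))


# Hypothetical sample data: a small XML-like graph with one reference edge (1 -> 5).
labels = {0: "db", 1: "person", 2: "person", 3: "name",
          4: "name", 5: "company", 6: "name"}
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (0, 5), (5, 6), (1, 5)]

summary = build_summary(labels, edges)
print(len(labels), "data nodes,", len(summary[0]), "summary nodes")
print(query_with_summary(summary, ["person", "name"]))
```

In this sketch the two name nodes under person collapse into one summary node, while the name node under company stays separate, so the label path person/name can be answered on the smaller graph and then expanded through the stored extents. Evaluating the same path directly on the data with eval_path(labels, edges, path) gives a way to check the summary's answer.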