Path bitmap indexing for retrieval of XML documents

  • Authors:
  • Jae-Min Lee;Byung-Yeon Hwang

  • Affiliations:
  • Department of Computer Engineering, Catholic University, Korea;Department of Computer Engineering, Catholic University, Korea

  • Venue:
  • MDAI'06 Proceedings of the Third international conference on Modeling Decisions for Artificial Intelligence
  • Year:
  • 2006

Abstract

Path-based indexing methods such as three-dimensional bitmap indexing have been used to collect and retrieve similar XML documents. In these methods, the path is the fundamental unit for constructing the index. When a document's structure changes, the path extracted before the change and the path extracted after it are treated as entirely different, regardless of how small the change is. As a result, path-based indexing methods usually perform poorly when retrieving and clustering similar documents. A method that can detect similar paths is therefore needed for effective collection and retrieval of XML documents. This paper defines a new path construction similarity, which measures the similarity between paths, and proposes a path bitmap indexing method to load and extract similar paths effectively. The proposed method extracts a representative path from each group of similar paths. The paths are clustered using these representative paths, and the XML documents are in turn clustered using the clustered paths. This addresses the problem with existing three-dimensional bitmap indexing. In the performance evaluation, the proposed method showed better clustering accuracy than existing methods.
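
The idea summarized in the abstract can be illustrated with a small sketch. The abstract does not give the definition of the path construction similarity or the rule for choosing a representative path, so everything below is an assumption made for illustration: `path_similarity` is a stand-in based on Python's `difflib.SequenceMatcher`, the threshold of 0.7 is arbitrary, the first path seen in a cluster serves as its representative, and the function names and sample documents are hypothetical.

```python
from difflib import SequenceMatcher


def path_similarity(p, q):
    """Stand-in similarity (NOT the paper's measure): ratio of matching
    element names between two paths, in order."""
    a = p.strip("/").split("/")
    b = q.strip("/").split("/")
    return SequenceMatcher(None, a, b).ratio()


def cluster_paths(paths, threshold=0.7):
    """Greedy clustering: a path joins the first cluster whose representative
    is similar enough; otherwise it starts a new cluster and becomes its
    representative (an assumed rule, chosen for simplicity)."""
    reps = []         # representative path of each cluster
    assignment = {}   # path -> cluster index
    for p in paths:
        for i, r in enumerate(reps):
            if path_similarity(p, r) >= threshold:
                assignment[p] = i
                break
        else:
            reps.append(p)
            assignment[p] = len(reps) - 1
    return reps, assignment


def build_bitmap(docs, assignment, n_clusters):
    """One row per document, one bit per path cluster."""
    bitmap = {}
    for name, paths in docs.items():
        row = [0] * n_clusters
        for p in paths:
            row[assignment[p]] = 1
        bitmap[name] = row
    return bitmap


# Hypothetical documents, each reduced to its set of root-to-leaf paths.
docs = {
    "d1": ["/book/title", "/book/author/name"],
    "d2": ["/book/title", "/book/authors/author/name"],
    "d3": ["/order/item/price"],
}
all_paths = sorted({p for ps in docs.values() for p in ps})
reps, assignment = cluster_paths(all_paths)
bitmap = build_bitmap(docs, assignment, len(reps))
```

With exact path matching, `d1` and `d2` would share only `/book/title`. Under the assumed similarity, `/book/author/name` and `/book/authors/author/name` fall into the same cluster, so the two documents get identical bitmap rows while `d3` remains distinct.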