Supporting efficient query processing on compressed XML files

  • Authors:
  • Yongjing Lin;Youtao Zhang;Quanzhong Li;Jun Yang

  • Affiliations:
  • University of Texas at Dallas, Richardson, TX;University of Texas at Dallas, Richardson, TX;IBM Almaden Research Center, San Jose, CA;University of California at Riverside, Riverside, CA

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

XML has been widely accepted as the de facto format for data representation and exchange. However, it is also known for the excessive redundancy of its representation. Various compression schemes have been proposed, and some of them support query processing over compressed files, but they usually must decompress part (or all) of the data, which is expensive and in some cases may dominate the query processing time.

In this paper, we propose a new XML compression scheme based on the Sequitur compression algorithm. By organizing the compression result as a set of context-free grammar rules, the scheme supports efficient processing of XPath queries without decompression. The experimental results show that this scheme achieves a compression ratio comparable to that of gzip, while its query processing time is among the best of existing algorithms.
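
To make the idea of querying a grammar-compressed document concrete, the following is a minimal sketch and not the paper's scheme. It assumes the XML has already been tokenized into a flat list of tag strings, and it swaps Sequitur for a much simpler offline digram-replacement loop (in the style of Re-Pair) to obtain grammar rules. The function names `build_grammar`, `count_symbol`, and `expand` are hypothetical. The "query" is only a tag count, far simpler than XPath, but it shows how an answer can be computed from the rules alone, with each rule visited once.

```python
from collections import Counter


def build_grammar(tokens):
    """Replace the most frequent adjacent pair with a new rule until no pair repeats.

    Assumption: terminals are strings and nonterminals are ints, so they never collide.
    This is a simplified stand-in for Sequitur, not Sequitur itself.
    """
    seq = list(tokens)
    rules = {}  # nonterminal id -> (left symbol, right symbol)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        name = len(rules)
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def count_symbol(start, rules, target):
    """Count occurrences of a terminal by walking the rules, never the expanded text."""
    memo = {}

    def count(sym):
        if isinstance(sym, str):
            return 1 if sym == target else 0
        if sym not in memo:
            memo[sym] = sum(count(s) for s in rules[sym])
        return memo[sym]

    return sum(count(s) for s in start)


def expand(seq, rules):
    """Full decompression, included only to check the grammar is lossless."""
    out = []
    for s in seq:
        if isinstance(s, str):
            out.append(s)
        else:
            out.extend(expand(rules[s], rules))
    return out


tokens = ["<lib>"] + ["<book>", "<title>", "</title>", "</book>"] * 3 + ["</lib>"]
start, rules = build_grammar(tokens)
print(len(tokens), "tokens ->", len(start), "start symbols,", len(rules), "rules")
print("<title> count:", count_symbol(start, rules, "<title>"))
```

Because `count_symbol` memoizes per rule, its cost depends on the size of the grammar rather than the size of the original document, which is the intuition behind answering queries without decompression.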