A compressor for effective archiving, retrieval, and updating of XML documents

  • Authors: Jun-Ki Min; Myung-Jae Park; Chin-Wan Chung

  • Affiliations: Korea University of Technology and Education, Chungnam, Korea; Korea Advanced Institute of Science and Technology, Daejeon, Korea; Korea Advanced Institute of Science and Technology, Daejeon, Korea

  • Venue: ACM Transactions on Internet Technology (TOIT)
  • Year: 2006

Abstract

Like HTML documents, many XML documents reside on native file systems. Because XML data is irregular and verbose, it wastes disk space and network bandwidth. To overcome this verbosity problem, research on compressors for XML data has been conducted. Some XML compressors do not support querying compressed data, while others that do support it blindly encode tags and data values using predefined encoding methods. Moreover, existing XML compressors provide no facility for updating compressed XML data.

In this article, we propose XPRESS, an XML compressor that supports direct updates and efficient evaluation of queries on compressed XML data. XPRESS adopts a novel encoding method called reverse arithmetic encoding, which encodes the label paths of XML data, and it applies diverse encoding methods depending on the types of data values. Experimental results with real-life data sets show that XPRESS achieves significant improvements in query performance on compressed XML data together with reasonable compression ratios. On average, the query performance of XPRESS is 2.13 times better than that of an existing XML compressor, and the compression ratio of XPRESS is about 71%. Additionally, we demonstrate the efficiency of updates performed directly on compressed XML data.
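The abstract names reverse arithmetic encoding without describing it. The sketch below illustrates one plausible reading of the idea, not the authors' implementation: each tag is assumed to own a sub-interval of [0, 1) proportional to its frequency, and a label path is encoded by narrowing intervals from the last tag back toward the root, so that the interval of a path lies inside the interval of each of its suffixes. The function names (`tag_intervals`, `encode_path`, `matches_suffix`) and the sample tags are hypothetical.

```python
from fractions import Fraction


def tag_intervals(freqs):
    """Partition [0, 1) among tags in proportion to their frequencies.

    Assumption: tags are laid out in sorted order; any fixed order works.
    """
    total = sum(freqs.values())
    intervals = {}
    low = Fraction(0)
    for tag in sorted(freqs):
        width = Fraction(freqs[tag], total)
        intervals[tag] = (low, low + width)
        low += width
    return intervals


def encode_path(path, intervals):
    """Encode a label path (root first) as an interval.

    The path is processed from its last tag back to the root, so the
    result is nested inside the interval of every suffix of the path.
    """
    low, high = Fraction(0), Fraction(1)
    for tag in reversed(path):
        t_low, t_high = intervals[tag]
        width = high - low
        low, high = low + width * t_low, low + width * t_high
    return low, high


def matches_suffix(path_interval, query_interval):
    """True if the encoded path falls inside the encoded suffix query."""
    p_low, p_high = path_interval
    q_low, q_high = query_interval
    return q_low <= p_low and p_high <= q_high
```

Under these assumptions, a query such as `//author/name` can be checked against an encoded path with a single interval-containment test, without decoding the path: encode the query suffix with `encode_path(["author", "name"], intervals)` and pass both intervals to `matches_suffix`.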