Comparative Analysis of XML Compression Technologies

  • Authors:
  • Wilfred Ng;Wai-Yeung Lam;James Cheng

  • Affiliations:
  • Department of Computer Science, The Hong Kong University of Science and Technology;Department of Computer Science, The Hong Kong University of Science and Technology;Department of Computer Science, The Hong Kong University of Science and Technology

  • Venue:
  • World Wide Web
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

XML provides flexibility in publishing and exchanging heterogeneous data on the Web. However, the language is by nature verbose and thus XML documents are usually larger in size than other specifications containing the same data content. It is natural to expect that the data size will continue to grow as XML data proliferates on the Web. The size problem of XML documents hinders the applications of XML, since it substantially increases the costs of storing, processing and exchanging the data. The hindrance is more apparent in bandwidth- and memory-limited settings such as those applications related to mobile communication.In this paper, we survey a range of recently proposed XML specific compression technologies and study their efforts and capabilities to overcome the size problem. First, by categorizing XML compression technologies into queriable and unqueriable compressors, we explain the efforts in the representative technologies that aim at utilizing the exposed structure information from the input XML documents. Second, we discuss the importance of queriable XML compressors and assess whether the compressed XML documents generated from these technologies are able to support direct querying on XML data. Finally, we present a comparative analysis of the state-of-the-art XML conscious compression technologies in terms of compression ratio, compression and decompression times, memory consumption, and query performance.