CGT Code-Based XML Data Compression Method

  • Authors:
  • Sheng Zhang; Sha Chen; Yuping Liang

  • Venue:
  • ISECS '09 Proceedings of the 2009 Second International Symposium on Electronic Commerce and Security - Volume 02
  • Year:
  • 2009

Abstract

XML is a de facto standard for exchanging and presenting information on the Web. However, XML data is also recognized as verbose: repeated tags and structures heavily inflate the size of the data. This verbosity poses many challenges for conventional query processing and data exchange, and compression is an important way to overcome it. Based on the features of XML documents, we put forward a new XML data compression method called CGTXDC. It uses the XML Schema to construct a document tree that captures the structure information of the XML document, and it adopts CGT code to encode each tree node so that the structure of the original XML document is maintained. CGTXDC requires only a single pass over the input XML document during compression and does not need to build the document tree in memory. The experimental results show a much better compression ratio than that of representative XML compression methods such as XPRESS and XGrind.
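
The single-pass idea in the abstract can be illustrated with a minimal Python sketch. It is not the CGTXDC method and does not implement CGT code, whose definition is not given in this record. It assumes a much simpler stand-in: each element name is replaced by a small integer the first time it is seen, and an end tag is written as a hypothetical marker 0. The names TagCodeHandler, END, and encode are invented for this illustration. What the sketch shares with the description above is that the input is read once with a streaming parser and no document tree is held in memory.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagCodeHandler(ContentHandler):
    """Streaming encoder that replaces element names with integer codes.

    Assumption: this is a plain dictionary substitution used only for
    illustration; it is NOT the CGT code described in the paper.
    """

    END = 0  # hypothetical marker standing in for any end tag

    def __init__(self):
        super().__init__()
        self.codes = {}      # element name -> integer code, assigned on first sight
        self.structure = []  # structure stream: tag codes and END markers
        self.texts = []      # text stream: character data in document order
        self._buf = []       # pending character chunks for the current text node

    def _flush(self):
        # The parser may deliver one text node in several chunks, so join first.
        text = "".join(self._buf).strip()
        if text:
            self.texts.append(text)
        self._buf = []

    def startElement(self, name, attrs):
        self._flush()
        code = self.codes.setdefault(name, len(self.codes) + 1)
        self.structure.append(code)

    def endElement(self, name):
        self._flush()
        self.structure.append(self.END)

    def characters(self, content):
        self._buf.append(content)


def encode(xml_text: str) -> TagCodeHandler:
    """Encode an XML string in one pass and return the filled handler."""
    handler = TagCodeHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler
```

Calling encode on a small document with repeated person and name elements yields a short list of integers for the structure and a separate list of text values. Each repeated tag name then costs one small code instead of its full spelling, which is the verbosity the abstract refers to. A real compressor would additionally serialize both streams compactly and handle attributes, which this sketch ignores.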