Contemporary XML documents can be tens of megabytes long, and reducing their size, which allows them to be transferred faster, is a significant advantage for their users. In this paper, we describe a new XML compression scheme that outperforms the previous state-of-the-art algorithm, SCMPPM, by over 9% on average in compression ratio, while offering the practical feature of streamlined decompression and decompressing almost twice as fast. Applying the scheme can significantly reduce transmission time and bandwidth usage for XML documents published on the Web. The proposed scheme is based on a semi-dynamic dictionary of the most frequent words in the document (in both the annotation and the contents), automatic detection and compact encoding of numbers and specific patterns (such as dates or IP addresses), and a back-end PPM coding variant tailored to handle long matching sequences efficiently. Moreover, we show that the compression ratio can be improved by an additional 9% at the price of a significant slowdown.
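To make the semi-dynamic dictionary idea concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes a two-pass design: a first pass counts words and keeps the frequent ones, and a second pass replaces each dictionary word with its index. The function names, the `min_count` and `min_len` thresholds, and the list-of-symbols output format are all hypothetical choices made for illustration. Number and pattern encoding and the PPM back end are omitted.

```python
import re
from collections import Counter

# Group 1 matches a word (a run of ASCII letters); group 2 matches everything else.
TOKEN_RE = re.compile(r"([A-Za-z]+)|([^A-Za-z]+)")


def build_dictionary(text, min_count=2, min_len=3):
    """First pass: collect words frequent enough to deserve a dictionary slot.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    counts = Counter(m.group(1) for m in TOKEN_RE.finditer(text) if m.group(1))
    frequent = [w for w, c in counts.items() if c >= min_count and len(w) >= min_len]
    # Most frequent first, so a real coder could give these the shortest codes.
    frequent.sort(key=lambda w: (-counts[w], w))
    return frequent


def encode(text, dictionary):
    """Second pass: replace dictionary words with integer indices.

    Everything else (rare words, markup characters, digits) stays a literal string.
    """
    index = {word: i for i, word in enumerate(dictionary)}
    return [index.get(m.group(0), m.group(0)) for m in TOKEN_RE.finditer(text)]


def decode(symbols, dictionary):
    """Invert encode(): look up integers in the dictionary, copy literals as-is."""
    return "".join(dictionary[s] if isinstance(s, int) else s for s in symbols)


sample = (
    "<book><title>Data</title><price>10</price></book>"
    "<book><title>Data</title></book>"
)
dictionary = build_dictionary(sample)
symbols = encode(sample, dictionary)
restored = decode(symbols, dictionary)
```

Because the dictionary is built from the document itself, it has to be stored alongside the encoded symbols so that the decoder can rebuild the text. Words from element names and from text content share one dictionary here, mirroring the abstract's remark that both annotation and contents are covered.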