Arithmetic coding for data compression
Communications of the ACM
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Communications of the ACM
Automatic inference of models for statistical code compression
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
XMill: an efficient compressor for XML data
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Syntax of Programming Languages: Theory and Practice
Syntax of Programming Languages: Theory and Practice
MOS '96 Selected Presentations and Invited Papers Second International Workshop on Mobile Object Systems - Towards the Programmable Internet
XPRESS: a queriable compression for XML data
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Compressing XML with Multiplexed Hierarchical PPM Models
DCC '01 Proceedings of the Data Compression Conference
XGRIND: A Query-Friendly XML Compressor
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
XML compression techniques: A survey and comparison
Journal of Computer and System Sciences
Searchable compression of office documents by XML schema subtraction
XSym'10 Proceedings of the 7th international XML database conference on Database and XML technologies
Updates on grammar-compressed XML data
BNCOD'11 Proceedings of the 28th British national conference on Advances in databases
Hi-index | 0.00 |
We propose a scheme for automatically generating compressors for XML documents from Document Type Definition(DTD) specifications. Our algorithm is a lossless adaptive algorithm where the model used for compression and decompression is generated automatically from the DTD, and is used in conjunction with an arithmetic compressor to produce a compressed version of the document. The structure of the model mirrors the syntactic specification of the document. Our compression scheme is on-line, that is, it can compress the document as it is being read. We have implemented the compressor generator, and provide the results of experiments on some large XML databases whose DTD's are specified. We note that the average compression is better than that of XMLPPM, the only other on-line tool we are aware of. The tool is able to compress massive documents where XMLPPM failed to work as it ran out of memory. We believe the main appeal of this technique is the fact that the underlying model is so simple and yet so effective.