Lossy compression techniques have been applied to image and text compression, yielding compression factors vastly superior to those of lossless schemes. In this paper, we present a preliminary study of a set of lossy transformations for XML documents that preserve document semantics. Inspired by earlier techniques such as lossy text compression and literate programming, we apply a simple algorithm to XML syntactic constructs to lose superfluous layout information and redundant text. The resulting XML remains both human-readable and machine-readable. In addition, the transformations can considerably reduce a document's space occupancy and improve the effectiveness of conventional text compressors, making them a promising technology for several data management tasks.
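As a rough illustration of what losing superfluous layout information can mean in practice, the following minimal sketch removes whitespace-only text nodes (indentation and line breaks) and comments from an XML string, then compares sizes before and after a conventional compressor. It is not the algorithm studied in the paper: the function name `strip_layout`, the sample document, and the choice of zlib are all assumptions made for illustration, and the sketch assumes whitespace-only text carries no meaning in the document at hand.

```python
import xml.etree.ElementTree as ET
import zlib


def strip_layout(xml_text: str) -> str:
    """Hypothetical helper: drop whitespace-only text nodes from an XML string."""
    # The default ElementTree parser already discards comments.
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Text directly inside the element, before its first child.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        # Text following the element's closing tag.
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    return ET.tostring(root, encoding="unicode")


# Illustrative sample document (an assumption, not taken from the paper).
sample = """<catalog>
    <!-- layout-only comment -->
    <book id="1">
        <title>Data Compression</title>
        <price>30</price>
    </book>
</catalog>"""

compact = strip_layout(sample)
print(compact)
print(len(sample), "->", len(compact), "characters")
print(
    len(zlib.compress(sample.encode())),
    "->",
    len(zlib.compress(compact.encode())),
    "bytes after zlib",
)
```

The transformation is lossy because the original indentation and comments cannot be recovered, yet element names, attributes, and non-blank text are unchanged. Documents with mixed content or `xml:space="preserve"` would need a more careful treatment than this sketch provides.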