In many XML applications, parsing is a key operation. When processing involves modifying data, random access, or visiting elements in an order different from the one in which they are stored, a DOM parser has to be used. A major problem with DOM parsers is memory consumption: the DOM tree created from an XML document may be as large as 10 times the size of the original document. Maintaining the tree of a large document therefore requires a large amount of memory and may cause costly swapping. In the worst case, a DOM parser cannot handle a document at all because of its size. In this research, we develop a space-efficient DOM parser called SEDOM. It is based on a new compression approach and a set of manipulation algorithms that enable many DOM operations to be performed while the data are in compressed form, and that allow individual parts of a document to be compressed, decompressed, and manipulated. SEDOM can be used to efficiently manipulate very large XML documents. In this paper, we describe SEDOM and compare its performance with that of three existing DOM parsers and an XML compressor.
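The idea of compressing, decompressing, and manipulating individual parts of a document can be illustrated with a minimal sketch. This is not SEDOM's compression scheme or its algorithms, which the abstract does not detail; it simply stores each top-level subtree as zlib-compressed XML text and expands one subtree on demand. The class `CompressedSubtree` and the function `compress_children` are hypothetical names introduced for illustration, and the sketch still builds a full DOM tree once during the initial parse, which a real space-efficient parser would presumably avoid.

```python
import zlib
from xml.dom import minidom


class CompressedSubtree:
    """Holds one element subtree as zlib-compressed XML text (illustrative only)."""

    def __init__(self, element):
        self.tag = element.tagName
        self._blob = zlib.compress(element.toxml().encode("utf-8"))

    def compressed_size(self):
        """Number of bytes used by the compressed subtree."""
        return len(self._blob)

    def expand(self):
        """Decompress and parse only this subtree, returning its root element."""
        xml_text = zlib.decompress(self._blob).decode("utf-8")
        return minidom.parseString(xml_text).documentElement

    def update(self, element):
        """Recompress the subtree after it has been modified."""
        self._blob = zlib.compress(element.toxml().encode("utf-8"))


def compress_children(xml_text):
    """Parse a document once, then keep each top-level child element compressed."""
    doc = minidom.parseString(xml_text)
    parts = [
        CompressedSubtree(node)
        for node in doc.documentElement.childNodes
        if node.nodeType == node.ELEMENT_NODE
    ]
    doc.unlink()  # release the full DOM tree built by the initial parse
    return parts


# Small made-up document: three <book> elements under one root.
xml_text = "<library>" + "".join(
    f"<book id='{i}'><title>Title {i}</title></book>" for i in range(3)
) + "</library>"

parts = compress_children(xml_text)
book = parts[1].expand()             # only the second subtree is materialized
book.setAttribute("checked", "yes")  # ordinary DOM manipulation on that part
parts[1].update(book)                # store the modified subtree compressed again
print(parts[1].expand().toxml())
```

In this sketch the other subtrees stay compressed while one is edited, so the expanded DOM held in memory at any time covers only the part being manipulated. Performing DOM operations directly on compressed data, as the abstract describes, is beyond what this example shows.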