Efficient memory representation of XML documents

Authors:
Giorgio Busatto;Markus Lohrey;Sebastian Maneth
Affiliations:
Department für Informatik, Universität Oldenburg, Germany;FMI, Universität Stuttgart, Germany;Faculté I & C, EPFL, Switzerland
Venue:
DBPL'05 Proceedings of the 10th international conference on Database Programming Languages
Year:
2005

Abstract

Implementations that load XML documents and give access to them via, e.g., the DOM suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document itself. A considerable amount of this memory is needed to store the tree structure of the XML document. Here a technique is presented that represents the tree structure of an XML document efficiently. The representation exploits the high regularity of XML documents by “compressing” their tree structure, that is, by detecting and removing repetitions of tree patterns. The functionality of basic tree operations, such as traversal along edges, is preserved in the compressed representation. This allows queries (and in particular bulk operations) to be executed directly, without prior decompression. For certain tasks, such as validation against an XML type or checking equality of documents, the representation allows for provably more efficient algorithms than those running on conventional representations.
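
To make the idea of removing repetitions concrete, the following is a minimal sketch of a simpler, related technique: sharing identical subtrees so that the tree becomes a DAG. It is not the authors' algorithm, which removes repeated tree patterns and is therefore more general than sharing whole subtrees. The names `SharedTree`, `intern`, `add_element`, `child`, and `tree_size` are hypothetical and chosen for illustration only; text content and attributes are ignored, and the recursion assumes documents of modest depth.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One shared node: an element tag plus the ids of its children."""
    tag: str
    children: Tuple[int, ...]


class SharedTree:
    """Stores element structure as a DAG in which equal subtrees share one id."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self._ids: Dict[Node, int] = {}

    def intern(self, tag: str, children: List[int]) -> int:
        # Reuse the id of an identical (tag, children) node if one exists.
        key = Node(tag, tuple(children))
        if key not in self._ids:
            self._ids[key] = len(self.nodes)
            self.nodes.append(key)
        return self._ids[key]

    def add_element(self, elem: ET.Element) -> int:
        # Bottom-up: children are interned first, so equal subtrees
        # collapse to the same id before their parent is built.
        child_ids = [self.add_element(c) for c in elem]
        return self.intern(elem.tag, child_ids)

    def child(self, node_id: int, i: int) -> int:
        # Traversal along an edge works directly on the shared form.
        return self.nodes[node_id].children[i]

    def tree_size(self, node_id: int,
                  memo: Optional[Dict[int, int]] = None) -> int:
        # Number of nodes in the unfolded tree, computed without unfolding:
        # each shared node is evaluated only once.
        if memo is None:
            memo = {}
        if node_id not in memo:
            kids = self.nodes[node_id].children
            memo[node_id] = 1 + sum(self.tree_size(c, memo) for c in kids)
        return memo[node_id]


# A document with three structurally identical <book> subtrees.
doc = "<lib>" + "<book><title/><author/></book>" * 3 + "</lib>"
store = SharedTree()
root = store.add_element(ET.fromstring(doc))

stored_nodes = len(store.nodes)        # nodes actually kept in memory
unfolded_nodes = store.tree_size(root)  # nodes of the original tree

# Two documents interned into the same store have the same root id
# exactly when their element structure is identical.
other_root = store.add_element(ET.fromstring(doc))
same_structure = (root == other_root)
```

In this sketch the three `book` subtrees are stored once, and the equality check on documents reduces to comparing two integers. That is one small instance of a task that becomes cheaper on a shared representation than on a conventional tree.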