Bulkloading and maintaining XML documents

Authors:
Albrecht Schmidt;Martin Kersten
Affiliations:
CWI, NL-1090 GB Amsterdam;CWI, NL-1090 GB Amsterdam
Venue:
Proceedings of the 2002 ACM symposium on Applied computing
Year:
2002

Citing 11
Cited 1

Change detection in hierarchically structured information
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

A query language and optimization techniques for unstructured data
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

Meaningful change detection in structured data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data

Query automata
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Storing semistructured data with STORED
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data

On supporting containment queries in relational database management systems
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data

Querying XML Documents Made Easy: Nearest Concept Queries
Proceedings of the 17th International Conference on Data Engineering

Relational Databases for Querying XML Documents: Limitations and Opportunities
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

Efficient Relational Storage and Retrieval of XML Documents
Selected papers from the Third International Workshop WebDB 2000 on The World Wide Web and Databases

XML and Object-Relational Database Systems - Enhancing Structural Mappings Based on Statistics
Selected papers from the Third International Workshop WebDB 2000 on The World Wide Web and Databases

MIL primitives for querying a fragmented world
The VLDB Journal — The International Journal on Very Large Data Bases

Integrated querying of XML data in RDBMSs
Proceedings of the 2003 ACM symposium on Applied computing

Abstract

The popularity of XML as an exchange and storage format brings about massive amounts of documents to be stored, maintained and analyzed, a challenge that has traditionally been tackled with Database Management Systems (DBMS). To open up the content of XML documents to analysis with declarative query languages, efficient bulk loading techniques are necessary. Database technology has traditionally offered support for these tasks, yet it falls short of providing efficient automation techniques for the challenges that large collections of XML data raise. As a storage back-end, many applications rely on relational databases, which are designed for large data volumes. This paper studies bulk load and update algorithms for XML data stored in relational format and outlines opportunities and problems. We investigate both (1) bulk insertion and deletion and (2) updates in the form of edit scripts, which rely heavily on pointer-chasing techniques that are often considered orthogonal to the algebraic operations relational databases are optimized for. To get the most out of relational database systems, we show that one should make careful use of edit scripts and replace them with bulk operations if more than a very small portion of the database is updated. We implemented our ideas on top of the Monet Database System and benchmarked their performance.
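
The contrast the abstract draws between edit scripts and bulk operations can be illustrated with a minimal sketch. It is not the authors' implementation: it uses SQLite instead of Monet, and the single `node` edge table, the function names, and the sample document are all assumptions made for illustration. The edit-script variant follows parent/child pointers one node at a time, while the bulk variant removes the same subtrees with one set-oriented statement.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the paper.
DOC = ("<lib><book><title>A</title></book>"
       "<book><title>B</title></book><note>n</note></lib>")


def shred(xml_text):
    """Flatten XML into (id, parent, tag, text) rows in document order."""
    rows = []

    def visit(elem, parent_id):
        node_id = len(rows) + 1
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return rows


def bulk_load(conn, rows):
    """Create the assumed edge table and insert all rows in one batch."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "tag TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)


def delete_by_edit_script(conn, root_ids):
    """Edit-script style: chase child pointers and delete node by node.

    Returns the number of SQL statements issued.
    """
    statements = 0
    stack = list(root_ids)
    while stack:
        node_id = stack.pop()
        children = conn.execute(
            "SELECT id FROM node WHERE parent = ?", (node_id,)
        ).fetchall()
        stack.extend(child[0] for child in children)
        conn.execute("DELETE FROM node WHERE id = ?", (node_id,))
        statements += 2  # one lookup plus one delete per node
    return statements


def delete_in_bulk(conn, tag):
    """Bulk style: one statement removes every subtree rooted at `tag`.

    Returns the number of SQL statements issued.
    """
    conn.execute(
        """
        WITH RECURSIVE doomed(id) AS (
            SELECT id FROM node WHERE tag = ?
            UNION
            SELECT node.id FROM node JOIN doomed ON node.parent = doomed.id
        )
        DELETE FROM node WHERE id IN (SELECT id FROM doomed)
        """,
        (tag,),
    )
    return 1
```

In this sketch the statement count of the edit-script variant grows with the number of deleted nodes, whereas the bulk variant always issues a single statement. This only mirrors the abstract's advice in spirit; the sketch does not measure where the actual crossover point lies.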