Processing XPath Queries in PC-Clusters Using XML Data Partitioning

Authors:
Kentarou Kido; Toshiyuki Amagasa; Hiroyuki Kitagawa
Affiliations:
University of Tsukuba, Japan; University of Tsukuba, Japan; University of Tsukuba, Japan
Venue:
ICDEW '06 Proceedings of the 22nd International Conference on Data Engineering Workshops
Year:
2006

Abstract

With the rapid spread of the XML format, it has become common to describe large-scale data, ranging in size from several hundred MB to several GB, in XML. XML databases are a reasonable choice for storing and retrieving such large XML data quickly and reliably. Among the many ways to realize an XML database, one of the most popular is the relational XML database, in which XML data is mapped to relational tables and queries are processed as SQL queries. However, some researchers have pointed out that the performance of relational XML databases degrades when dealing with such large XML data. In this study, we propose a scheme for parallel processing of XML data using PC clusters. First, we discuss how to decompose XML data so that XML queries can be processed in parallel. We define vertical and horizontal decomposition of XML data, based on decomposition of the schema graph and of XML instances, respectively. To allocate the decomposed XML data to cluster nodes, we give a greedy-style algorithm that computes a pseudo-optimal assignment of XML fragments in light of the XML query workload. Finally, we experimentally evaluate the effectiveness of the proposed method.
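
The abstract does not spell out the allocation algorithm, so the following is only a rough illustration of what a workload-aware greedy assignment of fragments to cluster nodes could look like. It is a generic "heaviest fragment first, least-loaded node next" heuristic, not the authors' method. The cost model (fragment size multiplied by the frequency of the queries that touch it) and all names (`workload_costs`, `greedy_assign`, `node_loads`) are assumptions made for this sketch.

```python
import heapq


def workload_costs(fragment_sizes, queries):
    """Estimate a per-fragment cost from a query workload.

    Assumed cost model (not from the paper): each query contributes
    frequency * fragment size to every fragment it touches.
    `queries` is a list of (frequency, set_of_fragment_names).
    """
    costs = {name: 0 for name in fragment_sizes}
    for frequency, touched in queries:
        for name in touched:
            costs[name] += frequency * fragment_sizes[name]
    return costs


def greedy_assign(fragment_costs, num_nodes):
    """Assign each fragment to the currently least-loaded node.

    Fragments are visited in order of decreasing cost (ties broken by name),
    which is the classic greedy load-balancing heuristic.
    """
    # Min-heap of (current_load, node_id); the smallest load is popped first.
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(fragment_costs.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cost in ordered:
        load, node = heapq.heappop(heap)
        assignment[name] = node
        heapq.heappush(heap, (load + cost, node))
    return assignment


def node_loads(fragment_costs, assignment, num_nodes):
    """Sum the estimated cost placed on each node."""
    loads = [0] * num_nodes
    for name, node in assignment.items():
        loads[node] += fragment_costs[name]
    return loads


if __name__ == "__main__":
    # Hypothetical fragment sizes (e.g. in MB) and a two-query workload.
    sizes = {"authors": 40, "articles": 120, "reviews": 60, "index": 10}
    workload = [(5, {"articles", "authors"}), (2, {"reviews", "index"})]
    costs = workload_costs(sizes, workload)
    placement = greedy_assign(costs, num_nodes=2)
    print(placement)
    print(node_loads(costs, placement, 2))
```

A greedy heuristic of this kind does not guarantee an optimal balance, which is consistent with the abstract's description of the result as pseudo-optimal. A real allocation would presumably also need to account for which fragments a query must join across nodes, which this sketch ignores.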