Querying XML Data using PC Cluster System

Authors:
Toshiyuki Amagasa; Kentarou Kido; Hiroyuki Kitagawa
Affiliations:
University of Tsukuba, Japan; University of Tsukuba, Japan; University of Tsukuba, Japan
Venue:
DEXA '07 Proceedings of the 18th International Conference on Database and Expert Systems Applications
Year:
2007

Abstract

This paper proposes a novel approach for querying large-scale XML data using a PC cluster system. With the recent spread of the XML format, large-scale XML data, ranging from several hundred megabytes to several gigabytes, has become common. However, XML databases are often inefficient at handling such large data because of the complexity of the XML data model and of XML query processing. To cope with this problem, we construct a parallel XML database on top of a PC cluster system. To this end, we discuss XML data partitioning that enables parallel processing of XML queries, and introduce a path-based partitioning scheme for XML data. The resulting XML fragments are then allocated to cluster nodes. To obtain a cost-efficient allocation of the fragments, we discuss cost functions for parallel XPath processing and an algorithm, based on the well-known genetic algorithm, that computes a pseudo-optimal allocation. Finally, we demonstrate the effectiveness of the proposed scheme through a series of experiments.
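
The allocation step can be illustrated with a minimal sketch. The abstract does not give the paper's cost functions or the details of its genetic algorithm, so everything below is an assumption made for illustration: each fragment is given a single hypothetical processing cost, the cost of an allocation is taken to be the load of the busiest node, and the names `makespan` and `allocate_fragments` are invented here. It shows only the general shape of a genetic search over fragment-to-node assignments, not the authors' method.

```python
import random


def makespan(allocation, costs, num_nodes):
    """Load of the busiest node; a stand-in for the paper's cost functions."""
    loads = [0.0] * num_nodes
    for fragment, node in enumerate(allocation):
        loads[node] += costs[fragment]
    return max(loads)


def allocate_fragments(costs, num_nodes, pop_size=30, generations=200,
                       mutation_rate=0.1, seed=0):
    """Search for a low-cost fragment-to-node assignment with a simple GA.

    An individual is a list whose i-th entry is the node that holds
    fragment i. Assumes at least two fragments and pop_size >= 4.
    """
    rng = random.Random(seed)
    n = len(costs)

    def fitness(individual):
        return makespan(individual, costs, num_nodes)

    # Start from random assignments.
    population = [[rng.randrange(num_nodes) for _ in range(n)]
                  for _ in range(pop_size)]

    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=fitness)
        survivors = population[:pop_size // 2]

        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            for i in range(n):                     # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(num_nodes)
            children.append(child)

        population = survivors + children

    return min(population, key=fitness)


# Hypothetical per-fragment costs, for illustration only.
fragment_costs = [5, 3, 8, 2, 7, 4, 6, 1]
best = allocate_fragments(fragment_costs, num_nodes=3)
print(best, makespan(best, fragment_costs, 3))
```

A genetic search of this kind returns a good allocation rather than a provably optimal one, which matches the abstract's term "pseudo-optimal". A cost function for parallel XPath processing would presumably also have to reflect which fragments a query touches, which this single-number cost ignores.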