Parallel XML query processing systems that process numerous queries over large heterogeneous XML documents often underperform because of workload imbalance and low CPU and system utilization: conventional partitioning strategies do not suit state-of-the-art query processing algorithms such as holistic twig joins. Consequently, partitioning and distributing heterogeneous XML documents across a parallel cluster system while maintaining good query performance is an intricate problem. In this paper, we propose XML data partitioning strategies that alleviate the performance degradation caused by workload imbalance, especially for parallel holistic twig join processing. The proposed strategies aim to improve workload balance under both static and dynamic data distribution. In the first strategy, for static data distribution, we refine a high-cost XML partition through a series of partition refinements at successively finer levels of granularity: document, query, subquery, and finally node streams. The granularity level chosen for refining a high-cost partition depends on the overall workload balance in the system. In the second strategy, for dynamic data distribution, we address the low system utilization that arises when many nodes in the system are idle. We propose an XML data redistribution approach that partitions XML data on the fly at node-stream granularity.
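To make the first strategy more concrete, the following is a minimal sketch of one way a refinement loop over granularity levels could look. It is not the method of the paper. The scalar cost per partition, the greedy assignment to cluster nodes, the equal-cost two-way split, and the imbalance threshold of 1.1 are all assumptions made for illustration, and every name in it (`Partition`, `assign`, `imbalance`, `split`, `refine`) is hypothetical.

```python
from dataclasses import dataclass

# Granularity levels named in the abstract, from coarsest to finest.
LEVELS = ["document", "query", "subquery", "node_stream"]


@dataclass
class Partition:
    name: str
    cost: float     # assumed: an estimated processing cost for this partition
    level: int = 0  # index into LEVELS


def assign(partitions, n_nodes):
    """Greedily place each partition (heaviest first) on the least-loaded node.

    Returns the resulting per-node loads. This placement rule is an
    assumption, not something the abstract specifies.
    """
    loads = [0.0] * n_nodes
    for p in sorted(partitions, key=lambda q: q.cost, reverse=True):
        loads[loads.index(min(loads))] += p.cost
    return loads


def imbalance(loads):
    """Ratio of the heaviest node load to the mean load (1.0 is perfect)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 1.0


def split(p, fanout=2):
    """Hypothetical refinement: cut a partition into equal-cost pieces
    at the next finer granularity level."""
    return [
        Partition(f"{p.name}.{i}", p.cost / fanout, p.level + 1)
        for i in range(fanout)
    ]


def refine(partitions, n_nodes, threshold=1.1):
    """Refine the costliest refinable partition until the workload is
    balanced enough or every partition is already at node-stream level."""
    parts = list(partitions)
    while imbalance(assign(parts, n_nodes)) > threshold:
        refinable = [p for p in parts if p.level < len(LEVELS) - 1]
        if not refinable:
            break
        heavy = max(refinable, key=lambda q: q.cost)
        parts.remove(heavy)
        parts.extend(split(heavy))
    return parts


# Usage: one expensive document and two cheap ones on a two-node cluster.
demo = [Partition("d1", 8.0), Partition("d2", 1.0), Partition("d3", 1.0)]
refined = refine(demo, n_nodes=2)
for part in refined:
    print(part.name, LEVELS[part.level], part.cost)
print("imbalance:", imbalance(assign(refined, 2)))
```

In this sketch the loop only descends to a finer level when the current placement is still unbalanced, which mirrors the statement that the choice of granularity depends on the overall workload balance. A real system would need a cost model for twig joins and a split that respects query and stream boundaries, both of which are outside what the abstract describes.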