Parallel XML processing by work stealing

  • Authors: Wei Lu; Dennis Gannon

  • Affiliations: Indiana University, Bloomington, IN; Indiana University, Bloomington, IN

  • Venue: Proceedings of the 2007 workshop on Service-oriented computing performance: aspects, issues, and approaches
  • Year: 2007

Abstract

XML, a language for semi-structured documents, has emerged as the core of the web services architecture and plays crucial roles in messaging systems, databases, and document processing. However, XML processing is regarded as the performance bottleneck in most systems and applications. Meanwhile, multicore processors, which emerged as a response to the clock-speed limits of modern CPUs, have become increasingly prevalent, and exploiting the parallelism of multicore resources to speed up software execution is becoming a trend in software development. In this paper, we present a parallel processing model for XML documents. The model is not designed for one specific XML processing task; it is a general model with which various kinds of parallel XML document processing can be explored. The kernel of the model is a stealing-based dynamic load-balancing mechanism, called ThreadCrew, by which multiple threads process disjoint parts of the XML document in parallel with a balanced load distribution. The model also provides a novel mechanism to trace the stealing actions, so that the equivalent sequential result can be obtained by gluing the results of the parallel runs together. To show the feasibility and effectiveness of our approach, we present our C# implementation of parallel XML serialization. Our empirical study shows that the parallel XML serialization algorithm can significantly improve XML serialization performance on a multicore machine.
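
The idea of stealing-based load balancing with a steal trace can be illustrated with a minimal Python sketch. This is not the authors' ThreadCrew or their C# implementation: the names `Span` and `parallel_serialize` are hypothetical, only the direct children of the root element are distributed, and the "trace" is reduced to recording the start index of each contiguous fragment. Python threads also do not run this work on several cores at once, so the sketch shows only the scheduling and gluing logic, not the speedup.

```python
import threading
import xml.etree.ElementTree as ET


class Span:
    """A contiguous range [lo, hi) of child indexes owned by one worker."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.lock = threading.Lock()

    def take_front(self):
        """Owner: claim the next index, or None when the span is empty."""
        with self.lock:
            if self.lo >= self.hi:
                return None
            self.lo += 1
            return self.lo - 1

    def steal_back(self):
        """Thief: split off the back half, or None if fewer than 2 remain."""
        with self.lock:
            remaining = self.hi - self.lo
            if remaining < 2:
                return None
            mid = self.lo + remaining // 2
            stolen = (mid, self.hi)
            self.hi = mid
            return stolen

    def reset(self, lo, hi):
        """Install a stolen range as this worker's new span."""
        with self.lock:
            self.lo, self.hi = lo, hi


def parallel_serialize(root, workers=4):
    """Serialize the children of `root` with several threads (workers >= 1)."""
    children = list(root)
    n = len(children)
    step = -(-n // workers)  # ceiling division for the initial static split
    spans = [Span(min(w * step, n), min((w + 1) * step, n))
             for w in range(workers)]
    fragments = []  # steal trace: (start index, serialized text) per fragment
    trace_lock = threading.Lock()

    def run(me):
        mine = spans[me]
        start = mine.lo
        while True:
            parts = []
            i = mine.take_front()
            while i is not None:
                parts.append(ET.tostring(children[i], encoding="unicode"))
                i = mine.take_front()
            if parts:
                with trace_lock:
                    fragments.append((start, "".join(parts)))
            # Out of work: try to steal the back half of another span.
            for victim in spans:
                stolen = None if victim is mine else victim.steal_back()
                if stolen is not None:
                    mine.reset(*stolen)
                    start = stolen[0]
                    break
            else:
                return  # nothing left to steal anywhere

    threads = [threading.Thread(target=run, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Glue: every fragment is contiguous, so sorting by start index
    # restores document order regardless of which thread produced it.
    return "".join(text for _, text in sorted(fragments))
```

Calling `parallel_serialize(ET.fromstring(xml_text))` returns the concatenated serialization of the root's children in document order. Because an owner consumes its span from the front and a thief takes only the back half, each fragment stays contiguous, which is what makes the gluing step a simple sort.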