XML processing in DHT networks

  • Authors:
  • Serge Abiteboul; Ioana Manolescu; Neoklis Polyzotis; Nicoleta Preda; Chong Sun

  • Affiliations:
  • INRIA Futurs & University of Paris XI, Gemo Team, 4 rue Jacques Monod, Orsay Cedex, 91893, France. serge.abiteboul@inria.fr; INRIA Futurs & University of Paris XI, Gemo Team, 4 rue Jacques Monod, Orsay Cedex, 91893, France. ioana.manolescu@inria.fr; Computer Science Department, University of California, Santa Cruz, 1156 High St, Santa Cruz, CA 95064, United States. alkis@cs.ucsc.edu; INRIA Futurs & University of Paris XI, Gemo Team, 4 rue Jacques Monod, Orsay Cedex, 91893, France. nicoleta.preda@inria.fr; Computer Science Department, University of California, Santa Cruz, 1156 High St, Santa Cruz, CA 95064, United States. sunchong@soe.ucsc.edu

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

We study the scalable management of XML data in P2P networks based on distributed hash tables (DHTs). We identify performance limitations in this context and propose several techniques to overcome them. First, we adapt the DHT platform's index store and communication primitives to the needs of massive data processing. Second, we introduce a distributed hierarchical index and associated efficient algorithms to speed up query processing. Third, we present an innovative, XML-specific flavor of Bloom filters to reduce the data transfers entailed by query processing. Our approach is fully implemented in the KadoP system, which is used in a real-life software manufacturing application. Our experiments demonstrate the benefits of the proposed techniques.
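
As a rough illustration of how a Bloom filter can reduce data transfers between peers, the following minimal sketch uses a plain, generic Bloom filter. It is not the XML-specific variant the abstract refers to, and it is not KadoP code. The names `BloomFilter` and `semi_join_filter`, the string identifiers, and the parameters `num_bits` and `num_hashes` are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings (assumed example, not the XML-specific variant)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def semi_join_filter(local_ids, remote_filter):
    """Keep only the local identifiers that might match an entry on the remote peer."""
    return [doc_id for doc_id in local_ids if doc_id in remote_filter]


# Hypothetical scenario: peer A summarizes its identifiers in a compact filter,
# and peer B uses that filter to prune its own list before shipping it.
peer_a_filter = BloomFilter()
for doc_id in ["d1", "d2", "d3"]:
    peer_a_filter.add(doc_id)

candidates = semi_join_filter(["d2", "d3", "d9"], peer_a_filter)
```

In this sketch, peer B sends only `candidates` instead of its full list. A Bloom filter never produces false negatives, so every true match survives the pruning; false positives may let a few non-matching identifiers through, and these are discarded when the exact join is computed.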