Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems
Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
The Piazza Peer Data Management System
IEEE Transactions on Knowledge and Data Engineering
XPath lookup queries in P2P networks
Proceedings of the 6th annual ACM international workshop on Web information and data management
Structure and value synopses for XML data graphs
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Locating data sources in large distributed systems
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Approximate XML query answers in DHT-based P2P networks
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
Hi-index | 0.00 |
This paper addresses the problem of data placement, indexing, and querying large XML data repositories distributed over an existing P2P service infrastructure. Our architecture scales gracefully to the network and data sizes, is fully distributed, fault tolerant and self-organizing, and handles complex queries efficiently, even those queries that use full-text search. Our framework for indexing distributed XML data is based on both meta-data information and textual content. We introduce a novel data synopsis structure to summarize text that correlates textual with positional information and increases query routing precision. Our processing framework maps an XML query with full-text search into a distributed program that migrates from peer to peer, collecting relevant document locations along the way. In addition, we introduce methods to handle network updates, such as node arrivals, departures, and failures. Finally, we report on a prototype implementation, which is used to validate the accuracy of our data synopses and to analyze the various costs involved in indexing XML data and answering queries.