We study the scalable management of XML data in peer-to-peer (P2P) networks based on distributed hash tables (DHTs). We identify performance limitations in this context and propose several techniques to overcome them. First, we adapt the DHT platform's index store and communication primitives to the needs of massive data processing. Second, we introduce a distributed hierarchical index, with associated efficient algorithms, to speed up query processing. Third, we present an XML-specific variant of Bloom filters that reduces the data transfers entailed by query processing. Our approach is fully implemented in the KadoP system, which is used in a real-life software manufacturing application. Our experiments demonstrate the benefits of the proposed techniques.
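
To illustrate the general idea behind the third technique, the following is a minimal sketch of how a standard Bloom filter can cut data transfers when two peers join lists of identifiers. It is not the XML-specific variant the abstract refers to, whose design is not described here. The names `BloomFilter` and `filtered_join`, the filter parameters, and the use of plain document identifiers as join keys are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter over strings: m bits, k hash functions."""

    def __init__(self, m_bits: int, k_hashes: int) -> None:
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from one SHA-256 digest by double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def filtered_join(left_ids, right_ids, m_bits=1024, k_hashes=4):
    """Simulate a join between two peers (hypothetical setup).

    Peer A holds left_ids and sends only a compact filter to peer B.
    Peer B sends back just the identifiers that pass the filter.
    Peer A then discards false positives with an exact check.
    """
    bf = BloomFilter(m_bits, k_hashes)
    for doc_id in left_ids:
        bf.add(doc_id)

    # Peer B side: only candidates that may match are shipped.
    candidates = [d for d in right_ids if d in bf]

    # Peer A side: the exact membership test removes false positives.
    left = set(left_ids)
    result = [d for d in candidates if d in left]
    return candidates, result


if __name__ == "__main__":
    left = [f"doc{i}" for i in range(50)]
    right = [f"doc{i}" for i in range(40, 200)]
    candidates, result = filtered_join(left, right)
    print(f"shipped {len(candidates)} of {len(right)} identifiers")
    print(f"join result has {len(result)} identifiers")
```

A Bloom filter never produces false negatives, so the join result is exact. The saving comes from peer B shipping only the candidate identifiers rather than its full list, at the cost of sending the filter and tolerating a small number of false positives.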