With the rapid growth of XML-document traffic on the Internet, scalable content-based dissemination of XML documents to a large, dynamic group of consumers has become an important research challenge. To indicate the type of content they are interested in, data consumers typically specify their subscriptions in an XML pattern specification language (e.g., XPath). Given the large volume of subscribers, system scalability and efficiency mandate the ability to aggregate the set of consumer subscriptions into a smaller set of content specifications, so as to both reduce storage-space requirements and speed up the document-subscription matching process. In this paper, we provide the first systematic study of subscription aggregation in which subscriptions are specified as tree patterns (an important subclass of XPath expressions). The main challenge is to aggregate an input set of tree patterns into a smaller set of generalized tree patterns such that: (1) a given space constraint on the total size of the subscriptions is met, and (2) the loss in precision (due to aggregation) during document filtering is minimized. We propose an efficient tree-pattern aggregation algorithm that makes effective use of document-distribution statistics to compute a precise set of aggregate tree patterns within the allotted space budget. As part of our solution, we also develop several novel algorithms for tree-pattern containment and minimization, as well as for "least-upper-bound" computation over a set of tree patterns. These results are of interest in their own right and can prove useful in other domains, such as XML query optimization. Extensive results from a prototype implementation validate our approach.
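To make the trade-off between space and precision concrete, the following Python sketch matches small tree patterns against small document trees and compares two original subscriptions with one more general pattern that replaces them. It is only an illustration under stated assumptions: the `Doc` and `Pat` classes, the `embeds` and `selected` helpers, the sample documents, and the hand-picked aggregate `/media/*/title` are all hypothetical and are not the aggregation algorithm proposed in the paper. The sketch also assumes that a pattern root is anchored at the document root.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """A node of an XML document tree (element name plus child elements)."""
    label: str
    children: list = field(default_factory=list)


@dataclass
class Pat:
    """A tree-pattern node; label "*" is a wildcard.

    descendant=True means the edge from the parent is "//" rather than "/".
    """
    label: str
    children: list = field(default_factory=list)
    descendant: bool = False


def descendants(d):
    """Yield every proper descendant of document node d."""
    for c in d.children:
        yield c
        yield from descendants(c)


def embeds(p, d):
    """Return True if pattern node p can be mapped onto document node d."""
    if p.label != "*" and p.label != d.label:
        return False
    for q in p.children:
        # A "//" edge may land on any descendant, a "/" edge only on a child.
        candidates = descendants(d) if q.descendant else d.children
        if not any(embeds(q, c) for c in candidates):
            return False
    return True


def selected(patterns, docs):
    """Indices of documents matched by at least one pattern."""
    return [i for i, d in enumerate(docs) if any(embeds(p, d) for p in patterns)]


# Two original subscriptions: /media/book/title and /media/cd/title
p1 = Pat("media", [Pat("book", [Pat("title")])])
p2 = Pat("media", [Pat("cd", [Pat("title")])])
# One hand-picked generalization of both: /media/*/title
agg = Pat("media", [Pat("*", [Pat("title")])])

docs = [
    Doc("media", [Doc("book", [Doc("title")])]),
    Doc("media", [Doc("cd", [Doc("title")])]),
    Doc("media", [Doc("dvd", [Doc("title")])]),
]

exact = selected([p1, p2], docs)   # documents the subscribers actually want
approx = selected([agg], docs)     # documents the aggregate lets through
false_positives = [i for i in approx if i not in exact]
```

In this sample the aggregate stores one pattern instead of two and still matches every document the original subscriptions match, but it also admits the `dvd` document, which no subscriber asked for. Which generalization costs the least precision depends on how often such documents occur, which is why document-distribution statistics are relevant to the choice.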