Dissemination of heterogeneous XML data in publish/subscribe systems

Authors:
Yuan Ni;Chee Yong Chan
Affiliations:
IBM China Research Lab, Beijing, China;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 24
Cited 1

Automating the transformation of XML documents

Proceedings of the 3rd international workshop on Web information and data management
Schema-Driven Evaluation of Approximate Tree-Pattern Queries

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Using Schema Matching to Simplify Heterogeneous Data Translation

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Efficient Filtering of XML Documents for Selective Dissemination of Information

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Generic Schema Matching with Cupid

Proceedings of the 27th International Conference on Very Large Data Bases
Answering XML Queries on Heterogeneous Data Sources

Proceedings of the 27th International Conference on Very Large Data Bases
Efficient filtering of XML documents with XPath expressions

The VLDB Journal — The International Journal on Very Large Data Bases
Security Issues and Requirements for Internet-Scale Publish-Subscribe Systems

HICSS '02 Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS'02) - Volume 9
Path sharing and predicate evaluation for high-performance XML filtering

ACM Transactions on Database Systems (TODS)
Constraint-based XML query rewriting for data integration

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
THALIA: Test Harness for the Assessment of Legacy Information Integration Approaches

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Scalable security and accounting services for content-based publish/subscribe systems

Proceedings of the 2005 ACM symposium on Applied computing
FiST: scalable XML document filtering by sequencing twig patterns

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Structure and content scoring for XML

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Schema matching for transforming structured documents

Proceedings of the 2005 ACM symposium on Document engineering
Mapping-driven XML transformation

Proceedings of the 16th international conference on World Wide Web
Massively multi-query join processing in publish/subscribe systems

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Translating web data

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
S-ToPSS: semantic Toronto publish/subscribe system

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Towards an internet-scale XML dissemination service

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Early profile pruning on XML-aware publish-subscribe systems

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Dissemination of heterogeneous xml data

Proceedings of the 17th international conference on World Wide Web
Schema-conscious filtering of XML documents

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
A semantic approach to query rewriting for integrated XML data

ER'05 Proceedings of the 24th international conference on Conceptual Modeling

Multiple keyword-based queries over XML streams

Proceedings of the 20th ACM international conference on Information and knowledge management

Abstract

The publish-subscribe paradigm is an effective approach for data publishers to asynchronously disseminate relevant data to a large number of data subscribers. A lot of recent research has focused on extending this paradigm to support content-based delivery of XML data using more expressive XML-based subscription specifications that allow constraints on both data content and structure. However, due to the heterogeneous data schemas used by different data publishers even for data in the same domain, an important challenge is how to efficiently and effectively disseminate relevant data to subscribers whose subscriptions might be specified based on schemas that are different from those used by the data publishers. In this paper, we examine the options to resolve this schema heterogeneity problem in XML data dissemination, and propose a novel paradigm that is based on data rewriting. Our experimental results demonstrate the effectiveness of the data rewriting paradigm and identify the tradeoffs of the various approaches.