Peer data management systems (PDMS) are a natural extension of integrated information systems. They consist of a dynamic set of autonomous peers, each of which can mediate between the heterogeneous schemas of other peers. A new data source joins a PDMS by defining a semantic mapping to one or more other peers, thus forming a network of peers. Queries submitted to a peer are answered with data residing at that peer and with data reached along paths of mappings through the network. However, without optimization, query reformulation in a PDMS is very inefficient because of redundancy in mapping paths. We present a decentralized strategy that guides each peer in deciding along which further mappings a query should be sent. The strategy uses statistics of the peer's own data and statistics of the mappings to neighboring peers to predict whether it is worthwhile to send the query to a given neighbor or whether the query plan should be pruned at that point. These decisions are guided by a benefit and cost model that trades off the amount of data a neighbor will pass back against the execution cost of that step. Thus, we allow a PDMS to scale up to a large number of participating peers.
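
The local pruning decision described above can be illustrated with a minimal sketch. It is not the model from the paper: the abstract does not give the benefit and cost formulas, so the sketch assumes a simple benefit/cost ratio compared against a threshold. The names `NeighborEstimate`, `select_neighbors`, and `min_ratio`, as well as all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NeighborEstimate:
    """Hypothetical per-neighbor statistics; field names are illustrative."""
    peer_id: str
    expected_tuples: float  # estimated amount of data the neighbor passes back
    execution_cost: float   # estimated cost of sending the query along this mapping


def select_neighbors(estimates, min_ratio):
    """Return the ids of neighbors whose benefit/cost ratio reaches min_ratio.

    Neighbors below the threshold are pruned from the query plan.
    The ratio test is an assumption made for this sketch only.
    """
    selected = []
    for est in estimates:
        if est.execution_cost <= 0:
            # Assumption: a step with no cost is taken only if it returns data.
            if est.expected_tuples > 0:
                selected.append(est.peer_id)
            continue
        if est.expected_tuples / est.execution_cost >= min_ratio:
            selected.append(est.peer_id)
    return selected


# Made-up estimates for three neighboring peers.
estimates = [
    NeighborEstimate("p2", expected_tuples=500.0, execution_cost=10.0),
    NeighborEstimate("p3", expected_tuples=20.0, execution_cost=10.0),
    NeighborEstimate("p4", expected_tuples=0.0, execution_cost=0.0),
]
forwarded = select_neighbors(estimates, min_ratio=5.0)
```

In this sketch each peer decides using only its own estimates, which mirrors the decentralized character of the strategy. A real system would derive the estimates from statistics of the peer's data and of its mappings rather than from fixed numbers.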