Until recently, most data integration techniques relied on central components, e.g., global schemas, to enable transparent access to heterogeneous databases. Today, however, with the democratization of tools that facilitate knowledge elicitation in machine-processable formats, one can no longer rely on global, centralized schemas, as knowledge creation and consumption are becoming increasingly dynamic and decentralized. Peer Data Management Systems (PDMS) address this problem by eliminating the central semantic component and instead composing local, pairwise mappings to propagate queries from one database to the others. PDMS approaches proposed so far implicitly assume that all mappings used in this way are correct. This cannot be taken for granted in typical PDMS settings, where mappings can be created (semi-)automatically by independent parties. In this work, we propose a fully decentralized, efficient message-passing scheme to automatically detect erroneous mappings in PDMS. Our scheme is based on a probabilistic model in which we take advantage of transitive closures of mapping operations to confront local beliefs about the correctness of a mapping with evidence gathered around the network. We show that our scheme can be efficiently embedded in any PDMS and provide a preliminary evaluation of our techniques on sets of both automatically generated and real-world schemas.
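To make the idea of confronting a local belief with evidence from a mapping cycle more concrete, the following is a minimal sketch in Python, not the scheme proposed in this work. It follows one attribute around a cycle of pairwise mappings and then updates the belief in each mapping by exhaustive enumeration over a single cycle, rather than by decentralized message passing. The peer names, the attribute names, the helper functions `follow_cycle` and `posterior_correct`, and the feedback model (positive feedback is certain if all mappings are correct, impossible if exactly one is wrong, and occurs with an assumed compensation probability `delta` if two or more are wrong) are all illustrative assumptions.

```python
from itertools import product


def follow_cycle(mappings, cycle, attribute):
    """Translate `attribute` through every mapping along `cycle`.

    Returns the attribute obtained back at the starting peer,
    or None if some mapping on the way has no entry for it.
    """
    current = attribute
    # Pair each peer with its successor, wrapping around to close the cycle.
    for src, dst in zip(cycle, cycle[1:] + cycle[:1]):
        current = mappings[(src, dst)].get(current)
        if current is None:
            return None
    return current


def posterior_correct(priors, feedback_positive, delta=0.1):
    """Posterior P(mapping i is correct | feedback observed on one cycle).

    Assumed feedback model (hypothetical, for illustration only):
    P(positive) = 1 if no mapping is wrong, 0 if exactly one is wrong,
    and `delta` if two or more are wrong (errors may cancel out).
    """
    n = len(priors)
    total = 0.0
    correct_mass = [0.0] * n
    # Enumerate every correct/incorrect assignment of the n mappings.
    for states in product([True, False], repeat=n):
        weight = 1.0
        for is_correct, prior in zip(states, priors):
            weight *= prior if is_correct else 1.0 - prior
        wrong = states.count(False)
        p_positive = 1.0 if wrong == 0 else (0.0 if wrong == 1 else delta)
        weight *= p_positive if feedback_positive else 1.0 - p_positive
        total += weight
        for i, is_correct in enumerate(states):
            if is_correct:
                correct_mass[i] += weight
    if total == 0.0:
        raise ValueError("observed feedback is impossible under these priors")
    return [mass / total for mass in correct_mass]


# Hypothetical three-peer cycle A -> B -> C -> A; the last mapping is wrong.
mappings = {
    ("A", "B"): {"title": "name"},
    ("B", "C"): {"name": "label"},
    ("C", "A"): {"label": "author"},
}
returned = follow_cycle(mappings, ["A", "B", "C"], "title")
feedback_positive = returned == "title"
beliefs = posterior_correct([0.5, 0.5, 0.5], feedback_positive)
print(returned, feedback_positive, beliefs)
```

In this example the attribute does not return to itself, so the cycle yields negative feedback and the belief in every mapping on the cycle drops below its prior. A single cycle cannot tell which mapping is at fault; that requires combining evidence from several cycles that share mappings, which is where a message-passing formulation becomes useful.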