The TSIMMIS Approach to Mediation: Data Models and Languages
Journal of Intelligent Information Systems - Special issue: next generation information technologies and systems
EDUTELLA: a P2P networking infrastructure based on RDF
Proceedings of the 11th international conference on World Wide Web
Scaling Access to Heterogeneous Data Sources with DISCO
IEEE Transactions on Knowledge and Data Engineering
Querying Heterogeneous Information Sources Using Source Descriptions
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
MiniCon: A scalable algorithm for answering queries using views
The VLDB Journal — The International Journal on Very Large Data Bases
Efficiently Ordering Query Plans for Data Integration
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
The Piazza peer data management project
ACM SIGMOD Record
Completeness of integrated information sources
Information Systems - Special issue: Data quality in cooperative information systems
The Anatomy of the Grid: Enabling Scalable Virtual Organizations
International Journal of High Performance Computing Applications
A taxonomy of Data Grids for distributed data sharing, management, and processing
ACM Computing Surveys (CSUR)
Introduction to Operations Research and Revised CD-ROM 8
Introduction to Operations Research and Revised CD-ROM 8
Querying the internet with PIER
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Query processing over incomplete autonomous databases
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
SQLB: a query allocation framework for autonomous consumers and providers
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Journal on data semantics VIII
Query planning in the presence of overlapping sources
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Transactions on large-scale data- and knowledge-centered systems III
Hi-index | 0.00 |
This paper presents OptiSource, a novel approach of source selection that reduces the number of data sources accessed during query evaluation in large scale distributed data contexts. These contexts are typical of large scale Virtual Organizations (VO) where autonomous organizations share data about a group of domain concepts (e.g. patient, gene). The instances of such concepts are constructed from nondisjointed fragments provided by several local data sources. Such sources overlap in a non mastered way making data location uncertain. This fact, in addition to the absence of reliable statistics on source contents and the largenumber of sources, make current proposals unsuitable in terms of response quality and/or response time. OptiSource optimizes source selection by taking advantage of organizational aspects of VOs to predict the benefit of using a source. It uses an optimization model to distinguish the sets of sources that maximize benefits and minimize the number of sources to contact to while satisfying resource constraints. The precision and recall of source selection is highly improved as demonstrated by the tests performed with the OptiSource prototype.