A virtual database system is software that provides unified access to multiple information sources. If the sources overlap in their contents and are independently maintained, the likelihood of inconsistent answers is high. Solutions are often based on ranking (which sorts the different answers according to recurrence) and on fusion (which synthesizes a new value from the different alternatives according to a specific formula). In this paper we argue that both methods are flawed, and we offer alternative solutions based on knowledge about the performance of the source data, including features such as recentness, availability, accuracy and cost. These features are combined in a flexible utility function that expresses the overall value of a data item to the user. Utility allows us to (1) define a meaningful ranking on the inconsistent set of answers, and offer the top-ranked answer as the preferred answer; (2) determine whether a fusion value is indeed better than the initial values, by calculating its utility and comparing it to the utilities of the initial values; and (3) discover the best fusion: the fusion formula that optimizes the utility. The advantages of such performance-based and utility-driven ranking and fusion are considerable.
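The three uses of utility listed above can be illustrated with a minimal sketch. The abstract does not give the form of the utility function, so everything here is an assumption made for illustration: the weighted linear combination, the weight values, the normalization of features to [0, 1], and the names `Answer`, `utility`, `rank`, `fusion_improves` and `best_fusion`. In particular, the sketch assumes that the features of a fused value are already known; how they would be derived from the features of the initial values is not stated in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Answer:
    """One candidate answer plus assumed performance features in [0, 1]."""
    value: float
    recentness: float    # higher is better
    availability: float  # higher is better
    accuracy: float      # higher is better
    cost: float          # higher is worse


# Hypothetical user-chosen weights; not taken from the paper.
WEIGHTS: Dict[str, float] = {
    "recentness": 0.25,
    "availability": 0.25,
    "accuracy": 0.5,
    "cost": 0.25,
}


def utility(a: Answer, w: Dict[str, float] = WEIGHTS) -> float:
    """Assumed linear utility: reward the benefits, penalize the cost."""
    return (w["recentness"] * a.recentness
            + w["availability"] * a.availability
            + w["accuracy"] * a.accuracy
            - w["cost"] * a.cost)


def rank(answers: List[Answer], w: Dict[str, float] = WEIGHTS) -> List[Answer]:
    """(1) Order inconsistent answers by utility, best first."""
    return sorted(answers, key=lambda a: utility(a, w), reverse=True)


def fusion_improves(fused: Answer, originals: List[Answer],
                    w: Dict[str, float] = WEIGHTS) -> bool:
    """(2) A fused value is worthwhile only if it beats every initial value."""
    return utility(fused, w) > max(utility(a, w) for a in originals)


def best_fusion(candidates: List[Answer],
                w: Dict[str, float] = WEIGHTS) -> Answer:
    """(3) Among the outputs of candidate fusion formulas, keep the best one."""
    return max(candidates, key=lambda a: utility(a, w))
```

Under these assumptions, `rank(answers)[0]` plays the role of the preferred answer, and a fused value would be offered instead only when `fusion_improves` returns `True`. Changing the weights changes the ranking, which is one way to read the "flexible" utility function described in the abstract.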