Fusionplex: resolution of data inconsistencies in the integration of heterogeneous information sources

  • Authors:
  • Amihai Motro;Philipp Anokhin

  • Affiliations:
  • Department of Information and Software Engineering, George Mason University, University Drive, Fairfax, VA 22030-4444, USA;Department of Information and Software Engineering, George Mason University, University Drive, Fairfax, VA 22030-4444, USA

  • Venue:
  • Information Fusion
  • Year:
  • 2006

Abstract

Fusionplex is a system for integrating multiple heterogeneous and autonomous information sources that uses data fusion to resolve factual inconsistencies among the individual sources. To accomplish this, the system relies on source features, which are meta-data on the merits of each information source; for example, the recentness of the data, its accuracy, its availability, or its cost. The fusion process is controlled with several parameters: (1) with a vector of feature weights, each user defines an individual notion of data utility; (2) with thresholds of acceptance, users ensure minimal performance of their data, excluding from the fusion process data that are too old, too costly, or lacking in authority, or numeric data that are too high, too low, or obvious outliers; and, ultimately, (3) in naming a particular fusion function to be used for each attribute (for example, average, maximum, or simply any) users implement their own interpretation of fusion. Several simple extensions to SQL are all that is needed to allow users to state these resolution parameters, thus ensuring that the system is easy to use. Altogether, Fusionplex provides its users with powerful and flexible, yet simple, control over the fusion process. In addition, Fusionplex supports other critical integration requirements, such as information source heterogeneity, dynamic evolution of the information environment, quick ad-hoc integration, and intermittent source availability. The methods described in this paper were implemented in a prototype system that provides complete Web-based integration services for remote clients.
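
As an illustration only, the three controls named in the abstract (feature weights, acceptance thresholds, and a per-attribute fusion function) might be sketched as follows. This is not the Fusionplex implementation: the names `fuse`, `utility`, and `FUSION_FUNCTIONS`, the `min_utility` parameter, the assumption that feature scores are normalized to [0, 1], and the weighted-sum form of utility are all hypothetical choices made for this example.

```python
from statistics import mean

# Hypothetical registry of fusion functions, keyed by the names used in the abstract.
# "any" is modeled here as "take the first surviving value".
FUSION_FUNCTIONS = {
    "average": mean,
    "maximum": max,
    "any": lambda values: values[0],
}


def utility(features, weights):
    """Weighted sum of feature scores (assumed normalized to [0, 1])."""
    return sum(weight * features[name] for name, weight in weights.items())


def fuse(candidates, weights, thresholds, function_name, min_utility=0.0):
    """Resolve conflicting values reported for a single attribute.

    candidates: list of (value, features) pairs, one per source.
    weights: feature name -> weight, expressing the user's notion of utility.
    thresholds: feature name -> minimum acceptable score.
    Returns the fused value, or None if every candidate is rejected.
    """
    accepted = [
        value
        for value, features in candidates
        if all(features[name] >= limit for name, limit in thresholds.items())
        and utility(features, weights) >= min_utility
    ]
    if not accepted:
        return None
    return FUSION_FUNCTIONS[function_name](accepted)


# Three sources disagree on one numeric attribute (made-up illustrative data).
candidates = [
    (100.0, {"recentness": 0.9, "accuracy": 0.8}),
    (120.0, {"recentness": 0.2, "accuracy": 0.9}),  # too old: fails the threshold
    (110.0, {"recentness": 0.7, "accuracy": 0.6}),
]
weights = {"recentness": 0.5, "accuracy": 0.5}
thresholds = {"recentness": 0.5}

result = fuse(candidates, weights, thresholds, "average")
```

In this sketch the second source is excluded by the recentness threshold before fusion, so the chosen function operates only on the two remaining values. Swapping `"average"` for `"maximum"` or `"any"` changes how the surviving values are combined without touching the filtering step.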