A Scalable Approach to Integrating Heterogeneous Aggregate Views of Distributed Databases

Authors:
Sally McClean;Bryan Scotney;Kieran Greer
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2003

Abstract

Aggregate views are commonly used to summarize information held in very large databases, such as those encountered in data warehousing, large-scale transaction management, and statistical databases. Such applications often involve distributed databases that have developed independently and may therefore exhibit incompatibility, heterogeneity, and data inconsistency. We are concerned here with the integration of aggregates that have heterogeneous classification schemes, where local ontologies, in the form of such classification schemes, may be mapped onto a common ontology. In previous work, we developed a method for integrating such aggregates; that method is efficient but cannot handle the innate data inconsistencies that are likely to arise when a large number of databases are being integrated. In this paper, we develop an approach that can handle data inconsistencies and is thus inherently much more scalable. In our new approach, we first construct a dynamic shared ontology by analyzing the correspondence graph that relates the heterogeneous classification schemes; the aggregates are then derived by minimization of the Kullback-Leibler information divergence using the EM (Expectation-Maximization) algorithm. Because the shared ontology is constructed before any aggregates are computed, we may assess whether global queries on such aggregates are answerable, partially answerable, or unanswerable in advance of computing the aggregates themselves.
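The EM step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each local aggregate cell has already been mapped to a set of categories in the shared ontology, and that the Kullback-Leibler divergence is taken between the observed local proportions and those implied by a single shared distribution, in which case minimizing it is equivalent to maximum likelihood for grouped counts. The names `em_integrate`, `Cell`, `tables`, and `categories` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# One local aggregate cell: the shared-ontology categories it covers, and its count.
# (Hypothetical representation; the paper's own data structures are not shown here.)
Cell = Tuple[FrozenSet[str], float]


def em_integrate(tables: List[List[Cell]], categories: List[str],
                 iterations: int = 200, tol: float = 1e-10) -> Dict[str, float]:
    """Estimate a distribution over shared categories from coarsened local counts."""
    total = sum(count for table in tables for _, count in table)
    # Start from a uniform distribution so every category has positive mass.
    pi = {c: 1.0 / len(categories) for c in categories}
    for _ in range(iterations):
        # E-step: split each cell's count over its member categories
        # in proportion to the current estimate.
        expected = {c: 0.0 for c in categories}
        for table in tables:
            for members, count in table:
                mass = sum(pi[c] for c in members)
                if mass == 0.0:
                    continue  # nothing to apportion for this cell
                for c in members:
                    expected[c] += count * pi[c] / mass
        # M-step: renormalize the expected counts into probabilities.
        new_pi = {c: expected[c] / total for c in categories}
        change = max(abs(new_pi[c] - pi[c]) for c in categories)
        pi = new_pi
        if change < tol:
            break
    return pi


# Two local views with different classification schemes over categories A, B, C.
tables = [
    [(frozenset({"A"}), 30.0), (frozenset({"B", "C"}), 70.0)],
    [(frozenset({"A", "B"}), 50.0), (frozenset({"C"}), 50.0)],
]
estimate = em_integrate(tables, ["A", "B", "C"])
```

For this toy input the maximum-likelihood solution is A = 0.3, B = 0.2, C = 0.5, and the iteration converges toward it. The two views here happen to be consistent with each other; with inconsistent views the same update still yields a best-fit compromise, which is the situation the abstract says the earlier method could not handle.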