A Scalable Approach to Integrating Heterogeneous Aggregate Views of Distributed Databases
IEEE Transactions on Knowledge and Data Engineering
In statistical databases and data warehousing applications, aggregate views are commonly maintained as an underlying mechanism for summarising information. Where the databases or applications are distributed, or arise from independent data collections or system developments, there may be incompatibility, heterogeneity, and data inconsistency. These challenges need to be overcome if federations of aggregated databases are to be successfully incorporated into systems for database management, querying, retrieval, and knowledge discovery.

In this paper we address the issue of integrating aggregate views that have semantically heterogeneous classification schemes. In previous work we developed a methodology that is efficient but that cannot easily handle data inconsistencies. Our previous approach is therefore not particularly well suited to very large databases or to federations of large numbers of databases. We now address these scalability issues by introducing a methodology for heterogeneous aggregate view integration that constructs a dynamic shared ontology to which each of the aggregate views can be explicitly related. A maximum likelihood technique, implemented using the EM (Expectation-Maximisation) algorithm, is used to handle data inconsistencies inherently in the computation of integrated aggregates, which are described in terms of the dynamic shared ontology.
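The idea of relating each aggregate view to a shared set of fine-grained categories and estimating integrated proportions by maximum likelihood can be illustrated with a minimal sketch. This is not the paper's algorithm: it is the textbook EM update for multinomial counts observed over coarsened categories, and the function name `integrate_views`, the data layout, and the example counts are all hypothetical. Each view is assumed to be a list of `(cell, count)` pairs, where `cell` is the set of shared-ontology categories that the view's class covers.

```python
def integrate_views(views, n_categories, iterations=500, tol=1e-10):
    """Estimate proportions over shared categories from coarse aggregate views.

    views: list of views; each view is a list of (cell, count) pairs, where
           cell is a set of shared-category indices (hypothetical layout).
    Returns a list of estimated proportions, one per shared category.
    """
    total = sum(count for view in views for _, count in view)
    p = [1.0 / n_categories] * n_categories  # start from a uniform estimate

    for _ in range(iterations):
        # E-step: split each observed count across the categories in its
        # cell, in proportion to the current estimates.
        expected = [0.0] * n_categories
        for view in views:
            for cell, count in view:
                mass = sum(p[i] for i in cell)
                if mass == 0.0:
                    continue  # only reachable when the cell carries no count
                for i in cell:
                    expected[i] += count * p[i] / mass

        # M-step: renormalise the expected counts into proportions.
        new_p = [e / total for e in expected]
        change = max(abs(a - b) for a, b in zip(p, new_p))
        p = new_p
        if change < tol:
            break
    return p


# Two views of the same population with different classification schemes
# over three shared categories (illustrative counts only).
view_a = [({0}, 30), ({1, 2}, 70)]
view_b = [({0, 1}, 50), ({2}, 50)]
estimate = integrate_views([view_a, view_b], n_categories=3)
```

In this example the two views are mutually consistent with proportions (0.3, 0.2, 0.5), and the iteration settles close to those values. When views disagree, the same update still returns a single compromise estimate, which is the sense in which a likelihood-based approach absorbs inconsistencies rather than failing on them.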