Enforcing strictness in integration of dimensions: beyond instance matching

Authors:
Dariush Riazati;James A. Thom;Xiuzhen Zhang
Affiliations:
RMIT University, Melbourne, Australia;RMIT University, Melbourne, Australia;RMIT University, Melbourne, Australia
Venue:
Proceedings of the ACM 14th international workshop on Data Warehousing and OLAP
Year:
2011

Abstract

Maintaining strictness in dimensions is important when integrating data warehouses. A dimension that satisfies all of its roll-up constraints is said to be strict, a property that is required for correct aggregation. Existing work on instance matching does not address the problem of enforcing the strictness of roll-up constraints. In this paper, we use a graph matching-based approach to dimension instance matching and propose an algorithm that enforces strictness and reduces false positives. Because the graph matching algorithm, which makes use of similarity flooding, can be greedy in identifying matching members, we propose heuristics to further reduce false positive matches and false strictness. Experiments on real-world data demonstrate the effectiveness of our proposed approach.
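
To make the notion of strictness concrete, the following is a minimal sketch of a strictness check. It is not the algorithm proposed in the paper. It assumes the common reading of a roll-up constraint, namely that each member rolls up to at most one member of any given parent level. The function names, the triple layout (child member, parent level, parent member), and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def strictness_violations(rollups):
    """Return the members that roll up to more than one parent in a level.

    `rollups` is an iterable of (child_member, parent_level, parent_member)
    triples. This layout is an assumption made for the sketch.
    """
    parents = defaultdict(set)
    for child, parent_level, parent in rollups:
        parents[(child, parent_level)].add(parent)
    # A (child, level) pair with two or more distinct parents breaks strictness.
    return {key: sorted(ps) for key, ps in parents.items() if len(ps) > 1}


def is_strict(rollups):
    """A dimension instance is treated as strict when no violation exists."""
    return not strictness_violations(rollups)


# Hypothetical merged dimension: a false positive match has merged two
# distinct cities named "Springfield" into a single member.
merged = [
    ("Melbourne", "State", "Victoria"),
    ("Geelong", "State", "Victoria"),
    ("Springfield", "State", "Illinois"),
    ("Springfield", "State", "Missouri"),
]

print(is_strict(merged))
print(strictness_violations(merged))
```

In this sketch the merged "Springfield" member rolls up to two states, so a measure aggregated from city to state would be counted under both. This is the kind of incorrect aggregation that strictness is meant to rule out.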