Designing integration flows using hypercubes

  • Authors:
  • Kevin Wilkinson; Alkis Simitsis

  • Affiliations:
  • HP Labs, Palo Alto, CA; HP Labs, Palo Alto, CA

  • Venue:
  • Proceedings of the 14th International Conference on Extending Database Technology
  • Year:
  • 2011

Abstract

The design and implementation of an ETL (extract-transform-load) process for a data warehouse proceeds from a conceptual model to a logical model, and then to a physical model and implementation. The conceptual model conveys, at a high level, the data sources and targets and the transformation steps from sources to targets. The current state of the art is to express the conceptual model informally, using text descriptions and diagrams, which makes deriving a logical model time-consuming and error-prone. Our work aims toward a system that covers the whole ETL lifecycle by injecting several layers of optimization and validation throughout the process, starting with business-level objectives and ending with flow execution. In this paper, we focus on the ETL conceptual layer and present a solution that assists consultants in defining the needs and requirements at the early stages of an integration project. We present a conceptual model for ETL based on hypercubes and hypercube operations. This formal model captures the semantics of ETL at a high level, yet can also be machine-translated into a logical model for ETL. The use of hypercubes at the conceptual level renders a design that business users can easily understand, which reduces design and development time and produces a result that accurately captures service level agreements and business requirements.