State driven semantic modeling of operators in ETL workflow

  • Authors:
  • Wesley Deneke;Wingning Li;Craig Thompson

  • Affiliations:
  • University of Arkansas, Fayetteville, AR;University of Arkansas, Fayetteville, AR;University of Arkansas, Fayetteville, AR

  • Venue:
  • Journal of Computing Sciences in Colleges
  • Year:
  • 2010

Abstract

To process the flood of digital-age data, ETL tools operating on grids have given organizations the ability to efficiently filter, clean, and persist very large data sets by means of complex workflows. Currently, however, constructing such workflows is largely manual, time-intensive, and error-prone. Existing models omit the domain-related knowledge necessary to validate the structure of such large, complex ETL workflows during construction. This paper introduces the concepts of preconditions, postconditions, and field abstraction on ETL operators to provide a richer model for ETL workflows, one that can leverage relevant domain knowledge to represent and enforce operator constraints.
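
As a rough illustration of the idea, and not the authors' actual model, the sketch below treats each operator as a record with preconditions (states it requires) and postconditions (states it adds or removes), and checks a linear workflow by threading a state set through the operators. The class `Operator`, the function `validate`, the operator names, and the `field:state` labels such as `address:parsed` are all hypothetical. The labels refer to abstract fields rather than concrete column names, which is one possible reading of field abstraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical ETL operator annotated with state constraints."""
    name: str
    requires: frozenset              # preconditions on abstract field states
    adds: frozenset = frozenset()    # postconditions: states established
    removes: frozenset = frozenset() # postconditions: states invalidated


def validate(workflow, initial_state):
    """Check each operator's preconditions against the running state.

    Returns the final state set, or raises ValueError at the first
    operator whose preconditions are not satisfied.
    """
    state = set(initial_state)
    for op in workflow:
        missing = op.requires - state
        if missing:
            raise ValueError(
                f"{op.name}: unmet preconditions {sorted(missing)}"
            )
        # Apply postconditions to obtain the state seen by the next operator.
        state = (state - op.removes) | op.adds
    return state


# Assumed example operators over an abstract "address" field.
parse = Operator(
    "parse_address",
    requires=frozenset({"address:present"}),
    adds=frozenset({"address:parsed"}),
)
standardize = Operator(
    "standardize_address",
    requires=frozenset({"address:parsed"}),
    adds=frozenset({"address:standardized"}),
)

final_state = validate([parse, standardize], {"address:present"})

try:
    validate([standardize, parse], {"address:present"})
    misordered_rejected = False
except ValueError:
    misordered_rejected = True
```

In this sketch the correctly ordered workflow passes, while the reversed one is rejected during construction because `standardize_address` requires a state that no earlier operator has established.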