Automatic generation of data processing workflows for transportation modeling

  • Authors:
  • José Luis Ambite; Dipsy Kapoor

  • Affiliations:
  • University of Southern California, Marina del Rey, CA; University of Southern California, Marina del Rey, CA

  • Venue:
  • dg.o '07 Proceedings of the 8th annual international conference on Digital government research: bridging disciplines & domains
  • Year:
  • 2007

Abstract

Scientists, economists, and planners in government, industry and academia spend much of their time accessing, integrating, and analyzing data. However, many of their studies are one-of-a-kind with little sharing and reuse for subsequent endeavors. The Argos project seeks to improve the productivity of analysts by providing a framework that encourages reuse of data sources and data processing operations, and by developing tools to generate data processing workflows. In this paper, we present an approach to automatically generate data processing workflows. First, we define a methodology for assigning formal semantics to data and operations according to a domain ontology, which allows sharing and reuse. Specifically, we define data contents using relational descriptions in an expressive logic. Second, we develop a novel planner that uses relational subsumption to connect the output of a data processing operation with the input of another. Our modeling methodology has the significant advantage that the planner can automatically insert adaptor operations wherever necessary to bridge the inputs and outputs of operations in the workflow. We have implemented the approach in a transportation modeling domain.
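
The planning idea in the abstract, chaining operations by matching one operation's output description against another's input description and inserting adaptors where they do not match directly, can be illustrated with a toy sketch. This is not the Argos planner: the abstract describes relational subsumption in an expressive logic, whereas the sketch below approximates subsumption with a subset test over attribute/value facts. All names in it (`desc`, `subsumes`, `Operation`, `plan`, and the freight-flow operations and sources) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def desc(**facts):
    """Build a data description as a frozenset of (attribute, value) facts."""
    return frozenset(facts.items())


def subsumes(general, specific):
    """Assumed simplification: `general` subsumes `specific` when every
    fact required by `general` also holds in `specific`."""
    return general <= specific


@dataclass(frozen=True)
class Operation:
    name: str
    inputs: tuple      # descriptions the operation requires
    output: frozenset  # description of what the operation produces


def plan(goal, operations, sources, depth=5):
    """Backward-chain from `goal`; return an ordered list of operation
    names, or None if no workflow is found within `depth` steps."""
    # The goal is already satisfied by an available data source.
    if any(subsumes(goal, src) for src in sources):
        return []
    if depth == 0:
        return None
    for op in operations:
        if not subsumes(goal, op.output):
            continue
        steps = []
        for needed in op.inputs:
            sub = plan(needed, operations, sources, depth - 1)
            if sub is None:
                break  # this operation cannot be supplied; try the next one
            steps.extend(s for s in sub if s not in steps)
        else:
            return steps + [op.name]
    return None


# Hypothetical transportation example: the only source reports freight in
# tons, but network assignment needs truckloads, so an adaptor is required.
sources = [desc(data="freight_flow", unit="tons", region="county")]
operations = [
    Operation("tons_to_truckloads",
              (desc(data="freight_flow", unit="tons"),),
              desc(data="freight_flow", unit="truckloads")),
    Operation("assign_to_network",
              (desc(data="freight_flow", unit="truckloads"),),
              desc(data="link_volume")),
]

workflow = plan(desc(data="link_volume"), operations, sources)
print(workflow)
```

In this sketch the unit-conversion step plays the role of an adaptor: the user asks only for link volumes, and the planner adds the conversion because the source description (tons) does not satisfy the input description of the assignment operation (truckloads).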