A First Study on Clustering Collections of Workflow Graphs

  • Authors:
  • Emanuele Santos; Lauro Lins; James P. Ahrens; Juliana Freire; Cláudio T. Silva

  • Affiliations:
  • Scientific Computing and Imaging Institute, University of Utah; Scientific Computing and Imaging Institute, University of Utah; Los Alamos National Lab; School of Computing, University of Utah; Scientific Computing and Imaging Institute, University of Utah, and School of Computing, University of Utah

  • Venue:
  • Provenance and Annotation of Data and Processes
  • Year:
  • 2008

Abstract

As workflow systems become more widely used, the number of workflows and the volume of provenance they generate have grown considerably. New tools and infrastructure are needed to allow users to interact with, reason about, and reuse this information. In this paper, we explore the use of clustering techniques to organize large collections of workflow and provenance graphs. We propose two different representations for these graphs and present an experimental evaluation, using a collection of 1,700 workflow graphs, in which we study the trade-offs of these representations and the effectiveness of alternative clustering techniques.