Fast detection of exact clones in business process model repositories

  • Authors:
  • Marlon Dumas;Luciano García-Bañuelos;Marcello La Rosa;Reina Uba

  • Affiliations:
  • University of Tartu, Tartu, Estonia;University of Tartu, Tartu, Estonia;Queensland University of Technology, Brisbane, Australia and NICTA Queensland Lab, Brisbane, Australia;University of Tartu, Tartu, Estonia

  • Venue:
  • Information Systems
  • Year:
  • 2013

Abstract

As organizations reach higher levels of business process management maturity, they often find themselves maintaining very large process model repositories, representing valuable knowledge about their operations. A common practice within these repositories is to create new process models, or extend existing ones, by copying and merging fragments from other models. We contend that if these duplicate fragments, a.k.a. exact clones, can be identified and factored out as shared subprocesses, the repository's maintainability can be greatly improved. With this purpose in mind, we propose an indexing structure to support fast detection of clones in process model repositories. Moreover, we show how this index can be used to efficiently query a process model repository for fragments. This index, called RPSDAG, is based on a novel combination of a method for process model decomposition (namely the Refined Process Structure Tree), with established graph canonization and string matching techniques. We evaluated the RPSDAG with large process model repositories from industrial practice. The experiments show that a significant number of non-trivial clones can be efficiently found in such repositories, and that fragment queries can be handled efficiently.
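The general idea of indexing fragments by a canonical string can be illustrated with a toy sketch. This is not the RPSDAG and not the authors' implementation. It assumes each model has already been decomposed into a tree of nested fragments, and the tuple encoding, the kinds `seq`/`and`/`xor`, and all function names are hypothetical choices made for the illustration. A plain dictionary stands in for the index, and sorting the branches of a block stands in for graph canonization.

```python
from collections import defaultdict

# Hypothetical encoding (an assumption, not the paper's format):
# a task is a string label; a composite fragment is a tuple
# (kind, [children]) with kind in {"seq", "and", "xor"}.

def canonical_code(fragment):
    """Return a string that is equal for structurally identical fragments."""
    if isinstance(fragment, str):
        return fragment
    kind, children = fragment
    codes = [canonical_code(child) for child in children]
    if kind != "seq":
        # Branch order carries no meaning in parallel/choice blocks,
        # so sort the child codes to normalise it away.
        codes.sort()
    return f"{kind}({','.join(codes)})"

def index_fragments(models):
    """Map each composite fragment's canonical code to the models containing it."""
    index = defaultdict(set)

    def visit(model_id, fragment):
        if isinstance(fragment, str):
            return  # single tasks are trivial; only composite fragments are indexed
        index[canonical_code(fragment)].add(model_id)
        for child in fragment[1]:
            visit(model_id, child)

    for model_id, root in models.items():
        visit(model_id, root)
    return index

def exact_clones(index):
    """Keep only the fragments that occur in more than one model."""
    return {code: sorted(ids) for code, ids in index.items() if len(ids) > 1}

# Two made-up models sharing one parallel block with its branches in different order.
models = {
    "order": ("seq", ["receive", ("and", ["check stock", "check credit"]), "ship"]),
    "return": ("seq", ["receive", ("and", ["check credit", "check stock"]), "refund"]),
}
clones = exact_clones(index_fragments(models))
```

With the two made-up models above, `clones` holds a single entry for the shared parallel block, listed under both model identifiers. Because occurrences are stored as a set of model identifiers, this sketch only reports clones across models, not repeats within a single model.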