Distributed process discovery and conformance checking

  • Authors:
  • Wil M. P. van der Aalst

  • Affiliations:
  • Eindhoven University of Technology, Eindhoven, The Netherlands and Queensland University of Technology, Brisbane, Australia

  • Venue:
  • FASE'12 Proceedings of the 15th international conference on Fundamental Approaches to Software Engineering
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Process mining techniques have matured over the last decade and more and more organization started to use this new technology. The two most important types of process mining are process discovery (i.e., learning a process model from example behavior recorded in an event log) and conformance checking (i.e., comparing modeled behavior with observed behavior). Process mining is motivated by the availability of event data. However, as event logs become larger (say terabytes), performance becomes a concern. The only way to handle larger applications while ensuring acceptable response times, is to distribute analysis over a network of computers (e.g., multicore systems, grids, and clouds). This paper provides an overview of the different ways in which process mining problems can be distributed. We identify three types of distribution: replication, a horizontal partitioning of the event log, and a vertical partitioning of the event log. These types are discussed in the context of both procedural (e.g., Petri nets) and declarative process models. Most challenging is the horizontal partitioning of event logs in the context of procedural models. Therefore, a new approach to decompose Petri nets and associated event logs is presented. This approach illustrates that process mining problems can be distributed in various ways.