Discovering process models with genetic algorithms using sampling

  • Authors:
  • Carmen Bratosin; Natalia Sidorova; Wil van der Aalst

  • Affiliations:
  • Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands (all authors)

  • Venue:
  • KES'10 Proceedings of the 14th international conference on Knowledge-based and intelligent information and engineering systems: Part I
  • Year:
  • 2010

Abstract

Process mining, a new business intelligence area, aims at discovering process models from event logs. Complex constructs, noise and infrequent behavior make process mining a hard problem. A genetic mining algorithm, which applies genetic operators to search the space of all possible process models, handles these challenges successfully. Its drawback is high computation time, caused by the cost of fitness evaluation, which grows linearly with the number of process instances in the log. By using a sampling-based approach, i.e. evaluating fitness on a sample of the log instead of the whole log, we drastically reduce the computation time. When the desired fitness is achieved on the sample, we check the fitness on the whole log; if it is not yet achieved there, we increase the sample size and continue the computation iteratively. Our experiments show that sampling works well even for relatively small logs, and that the total computation time is reduced by a factor of 6 to 15.
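
The iterative scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `mine_with_sampling`, `toy_search` and `replay_fitness` are hypothetical, the genetic search is replaced by a trivial stand-in that collects directly-follows pairs, and both the doubling growth policy and the shuffled-prefix sampling are assumptions made only for illustration.

```python
import random


def replay_fitness(model, traces):
    """Toy fitness: fraction of traces whose consecutive activity pairs
    are all allowed by the model (a set of directly-follows pairs)."""
    ok = sum(all(pair in model for pair in zip(t, t[1:])) for t in traces)
    return ok / len(traces)


def toy_search(sample, model):
    """Stand-in for the genetic search (assumption, not a genetic algorithm):
    extend the current model with every pair observed in the sample."""
    pairs = set(model or ())
    for t in sample:
        pairs.update(zip(t, t[1:]))
    return frozenset(pairs)


def mine_with_sampling(log, search, fitness, target, initial_size,
                       growth=2, seed=0):
    """Optimise on a sample, verify on the whole log, grow the sample if needed."""
    rng = random.Random(seed)
    order = list(log)
    rng.shuffle(order)  # a random sample is a prefix of the shuffled log
    size = max(1, min(initial_size, len(order)))
    model = None
    while True:
        # The search only ever sees the current sample.
        model = search(order[:size], model)
        # Check the whole log; stop when the target holds there or the
        # sample has already grown to the full log.
        if size == len(order) or fitness(model, order) >= target:
            return model, size
        size = min(len(order), size * growth)  # assumed growth policy


# Hypothetical log with two trace variants.
log = [("a", "b", "c", "d")] * 60 + [("a", "c", "b", "d")] * 40
model, used = mine_with_sampling(log, toy_search, replay_fitness,
                                 target=1.0, initial_size=5)
print(f"sample size used: {used} of {len(log)} traces")
```

The saving comes from the search step evaluating fitness only on `order[:size]`; the whole log is touched once per round for the final check, which is why the total cost can fall well below that of evaluating every candidate on the full log.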