A Study of Quality and Accuracy Trade-offs in Process Mining

  • Authors:
  • Zan Huang, Akhil Kumar

  • Affiliations:
  • Smeal College of Business, Pennsylvania State University, University Park, Pennsylvania 16802

  • Venue:
  • INFORMS Journal on Computing
  • Year:
  • 2012

Abstract

In recent years, many algorithms have been proposed to extract process models from process execution logs. These process models describe the ordering relationships between the tasks in a process in terms of standard constructs like sequence, parallel, choice, and loop. Most algorithms assume that each trace in a log represents a correct execution sequence based on a model. In practice, logs are often noisy, and algorithms designed for correct logs cannot handle noisy ones. In this paper we share key insights from a study of noise in process logs, both real and synthetic. We found that any process log can be explained by a block-structured model with two special structures, self-loop and optional, making it trivial to build a fully accurate process model for any given log, even one containing inaccurate data or noise. Such a model, however, suffers from low quality. By controlling the use of self-loop and optional structures on tasks and blocks of tasks, we can balance the quality-accuracy trade-off and derive high-quality process models that explain a given percentage of the traces in the log. Finally, new quality metrics and a novel quality-based algorithm for extracting models from noisy logs are described. The results of experiments with the algorithm on real and synthetic data are reported and analyzed at length.
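
The tension the abstract describes, that a sufficiently permissive model replays every trace yet conveys little about the process, can be illustrated with a small sketch. The Python example below is not the authors' algorithm or their metrics: the maximally permissive "flower-like" model, the accuracy measure (fraction of replayable traces), and the structural quality penalty are all illustrative assumptions introduced here.

```python
# Illustrative sketch only: a maximally permissive model replays every trace
# (accuracy 1.0) but carries a high structural penalty, i.e., low quality.
# Model encoding, accuracy measure, and penalty are assumptions for this sketch,
# not the paper's definitions.

def flower_model(log):
    """A maximally permissive model: every observed task may be skipped
    (optional) or repeated (self-loop) in any order."""
    alphabet = {task for trace in log for task in trace}
    return {"alphabet": alphabet,
            "self_loops": len(alphabet),
            "optionals": len(alphabet)}

def accuracy(model, log):
    """Fraction of traces the model can replay; the flower-like model
    accepts any ordering of tasks it knows about."""
    fitting = sum(1 for trace in log if set(trace) <= model["alphabet"])
    return fitting / len(log)

def quality_penalty(model):
    """Naive structural penalty: each self-loop or optional construct
    makes the model less informative about task ordering."""
    return model["self_loops"] + model["optionals"]

if __name__ == "__main__":
    log = [
        ["a", "b", "c", "d"],
        ["a", "c", "b", "d"],
        ["a", "b", "b", "d"],   # noisy trace: repeated task
        ["a", "d"],             # noisy trace: skipped tasks
    ]
    m = flower_model(log)
    print("accuracy:", accuracy(m, log))            # 1.0 -- explains every trace
    print("quality penalty:", quality_penalty(m))   # high -- model is uninformative
```

Restricting where self-loop and optional constructs may be applied, as the paper proposes, lowers this kind of structural penalty while still explaining a chosen percentage of the traces.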