Towards efficient business process clustering and retrieval: combining language modeling and structure matching

  • Authors:
  • Mu Qiao;Rama Akkiraju;Aubrey J. Rembert

  • Affiliations:
  • Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY

  • Venue:
  • BPM'11 Proceedings of the 9th international conference on Business process management
  • Year:
  • 2011

Abstract

Large organizations tend to have hundreds of business processes. Discovering and understanding similarities among business processes can be useful to organizations for a number of reasons, including better overall process management and maintenance. In this paper we present a novel and efficient approach to cluster and retrieve business processes. A given set of business processes is clustered based on their underlying topic, structure and semantic similarities. In addition, given a query business process, the top-k most similar processes are retrieved based on the clustering results. In this work, we bring together two schools of work that are not well connected, statistical language modeling and structure matching, and combine them in a novel way. In finding similarities among business processes, our approach takes into account both high-level topic information that can be collected from process description documents and keywords, and detailed structural features such as process control flows. This ability to work with processes that may not always have formal control flows is particularly useful in dealing with real-world business processes, which are not always described formally. We developed a system to implement our approach and evaluated it on several collections of industry best practice processes and real-world business processes at a large IT service company that are described at varied levels of formalism. Our experimental results reveal that the combined language modeling and structure matching based retrieval outperforms structure-matching-only techniques in both mean average precision and running time.
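The retrieval idea in the abstract can be illustrated with a small sketch. This is not the authors' system: it assumes cluster labels have already been assigned, uses cosine similarity over word counts as a stand-in for the language-modeling component, and uses Jaccard overlap of control-flow edges as a stand-in for structure matching. The field names (`name`, `cluster`, `words`, `edges`), the weight `alpha`, and the sample processes are all hypothetical.

```python
import math
from collections import Counter


def text_similarity(words_a, words_b):
    """Cosine similarity of word-count vectors (stand-in for a language model)."""
    ca, cb = Counter(words_a), Counter(words_b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def structure_similarity(edges_a, edges_b):
    """Jaccard overlap of control-flow edges (stand-in for structure matching)."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b) if (a or b) else 0.0


def retrieve_top_k(query, processes, k=2, alpha=0.5):
    """Rank processes in the query's cluster by a blended similarity score."""
    scored = []
    for p in processes:
        if p["cluster"] != query["cluster"]:
            continue  # cluster labels prune candidates before any matching
        score = text_similarity(query["words"], p["words"])
        if query["edges"] and p["edges"]:
            # Blend in structure only when both sides have a control flow.
            structural = structure_similarity(query["edges"], p["edges"])
            score = alpha * score + (1 - alpha) * structural
        scored.append((score, p["name"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical sample data, invented for illustration only.
processes = [
    {"name": "order_to_cash", "cluster": 0,
     "words": ["order", "invoice", "payment"],
     "edges": [("receive_order", "send_invoice"),
               ("send_invoice", "collect_payment")]},
    {"name": "procure_to_pay", "cluster": 0,
     "words": ["purchase", "invoice", "payment"],
     "edges": [("create_po", "receive_invoice")]},
    {"name": "hire_employee", "cluster": 1,
     "words": ["candidate", "interview", "offer"],
     "edges": []},
]
query = {"name": "query", "cluster": 0,
         "words": ["order", "invoice", "payment"],
         "edges": [("receive_order", "send_invoice")]}

print(retrieve_top_k(query, processes, k=2))
```

Restricting candidates to the query's cluster is what would keep the more expensive structural comparison off most of the collection. Falling back to the text score when a control flow is missing mirrors the abstract's point that not every process is formally described, although text-only and blended scores are not strictly comparable in this simplified form.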