Business process mining from e-commerce web logs

  • Authors:
  • Nicolas Poggi; Vinod Muthusamy; David Carrera; Rania Khalaf

  • Affiliations:
  • Technical University of Catalonia (UPC), Barcelona, Spain, and Barcelona Supercomputing Center (BSC), Barcelona, Spain; IBM T. J. Watson Research Center, Yorktown, New York; Technical University of Catalonia (UPC), Barcelona, Spain, and Barcelona Supercomputing Center (BSC), Barcelona, Spain; IBM T. J. Watson Research Center, Yorktown, New York

  • Venue:
  • BPM'13 Proceedings of the 11th international conference on Business Process Management
  • Year:
  • 2013

Abstract

The dynamic nature of the Web and its increasing importance as an economic platform create a need for new methods and tools for business efficiency. Current Web analytics tools do not provide the necessary abstracted view of the underlying customer processes and the critical paths of site visitor behavior. Such information can offer insights that help businesses react effectively and efficiently. We propose applying Business Process Management (BPM) methodologies to e-commerce Website logs, and present the challenges, results, and potential benefits of such an approach. We use the Business Process Insight (BPI) platform, a collaborative process intelligence toolset that implements the discovery of loosely coupled processes and includes novel process mining techniques suitable for the Web. Experiments are performed on custom click-stream logs from a large online travel and booking agency. We first compare Web clicks and BPM events, and then present a methodology to classify and transform URLs into events. We evaluate traditional and custom process mining algorithms to extract business models from real-life Web data. The resulting models present an abstracted view of the relations between pages, exit points, and critical paths taken by customers. Compared with current state-of-the-art Web analytics, such models offer important improvements and aid high-level decision making and optimization of e-commerce sites.
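
As a rough illustration of the idea of turning clicks into process events, the following minimal sketch maps URLs to event types, groups them into per-session traces, and counts directly-follows transitions, including an exit point. It is not the paper's methodology or the BPI platform's implementation: the URL patterns, the event names, the `(session_id, timestamp, url)` log layout, and the function names are all hypothetical assumptions chosen for the example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical URL-to-event rules; the actual classification used in the
# paper is not described in the abstract.
URL_RULES = [
    (re.compile(r"^/search"), "Search"),
    (re.compile(r"^/product/"), "ProductDetails"),
    (re.compile(r"^/cart"), "Cart"),
    (re.compile(r"^/checkout"), "Buy"),
]


def classify(url):
    """Return the event type of the first matching rule, else 'Other'."""
    for pattern, event in URL_RULES:
        if pattern.search(url):
            return event
    return "Other"


def to_traces(clicks):
    """Group (session_id, timestamp, url) clicks into ordered event traces."""
    traces = defaultdict(list)
    for session_id, _ts, url in sorted(clicks, key=lambda c: (c[0], c[1])):
        traces[session_id].append(classify(url))
    return dict(traces)


def directly_follows(traces):
    """Count how often event a is immediately followed by event b.

    A synthetic 'Exit' event is appended to each trace so that the last
    page of a session shows up as an exit point.
    """
    counts = Counter()
    for events in traces.values():
        path = events + ["Exit"]
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += 1
    return counts


# Made-up click-stream records, for illustration only.
clicks = [
    ("s1", 1, "/search?q=barcelona"),
    ("s1", 2, "/product/42"),
    ("s1", 3, "/cart"),
    ("s1", 4, "/checkout/pay"),
    ("s2", 1, "/search?q=madrid"),
    ("s2", 2, "/product/7"),
]

traces = to_traces(clicks)
dfg = directly_follows(traces)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The directly-follows counts are only the simplest input that a process discovery algorithm could start from; the sketch makes no attempt to reproduce the traditional or custom mining algorithms evaluated in the paper.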