Golden Path Analyzer: using divide-and-conquer to cluster Web clickstreams

  • Authors:
  • Kamal Ali;Steven P. Ketchpel

  • Affiliations:
  • San Mateo, CA;San Mateo, CA

  • Venue:
  • Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes a novel algorithm and deployed system Golden Path Analyzer (GPA) that analyzes clickstreams of people trying to complete the same task on a website. It finds the shortest, successful paths taken by users - 'golden paths' - and uses these as seeds for clickstream clusters. Other users are assigned to a cluster if their clickstream is a supersequence of the golden path. The advantages of this approach are that the resulting clusters are easily comprehended, they are few in number, correspond to semantically different strategies used by the users, and jointly partition all the clickstreams. GPA's key contribution over prior work in process funnels is that by not excluding users that make diversions from the golden path, GPA is able to assign more users to fewer clusters. Another key contribution is to use actual full clickstreams as cluster seeds to which supersequences of other users are added. Golden paths correspond to complete clickstreams that are based on actual user page transitions. GPA is particularly useful for site designers to improve processes such as shopping, returns and registration. Its analyses identify which web pages cause many users to deviate from a golden path, which links distract users and the percentage of users taking each golden path. GPA has demonstrated value on more than twenty client projects in diverse industries.