Cleopatra: evolutionary pattern-based clustering of web usage data

  • Authors:
  • Qiankun Zhao;Sourav S. Bhowmick;Le Gruenwald

  • Affiliations:
  • CAIS, Nanyang Technological University, Singapore;CAIS, Nanyang Technological University, Singapore;University of Oklahoma, Norman

  • Venue:
  • PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2006

Abstract

Existing web usage mining techniques focus only on discovering knowledge from statistical measures obtained from the static characteristics of web usage data. They do not consider the dynamic nature of that data. In this paper, we present an algorithm called Cleopatra (CLustering of EvOlutionary PAtTeRn-based web Access sequences) to cluster web access sequences ($\mathcal{WAS}$s) based on their evolutionary patterns. In this approach, web access sequences whose support counts have changed in similar ways over their history are grouped into the same cluster. The intuition is that $\mathcal{WAS}$s are often event/task-driven. As a result, $\mathcal{WAS}$s related to the same event/task are expected to be accessed in similar ways over time. Such clusters are useful for applications such as intelligent web site maintenance and personalized web services.
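
The idea of grouping sequences by how their support counts change can be illustrated with a minimal sketch. This is not the Cleopatra algorithm itself: the abstract does not specify the distance metric or the clustering procedure, so the difference-based change pattern, the Euclidean distance, the greedy single-pass grouping, the threshold value, and all function names and sample data below are assumptions chosen for illustration only.

```python
from math import sqrt


def change_pattern(supports):
    """Successive differences of a support-count history (assumed representation)."""
    return [b - a for a, b in zip(supports, supports[1:])]


def normalize(vec):
    """Scale to unit length so only the shape of the change matters, not its size."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def distance(u, v):
    """Euclidean distance between two equal-length change patterns."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_by_evolution(histories, threshold=0.5):
    """Greedy single-pass grouping (an assumption, not the paper's method).

    Each sequence joins the first cluster whose representative pattern is
    within `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative_pattern, member_names)
    for name, supports in histories.items():
        pattern = normalize(change_pattern(supports))
        for representative, members in clusters:
            if distance(representative, pattern) <= threshold:
                members.append(name)
                break
        else:
            clusters.append((pattern, [name]))
    return [members for _, members in clusters]


# Hypothetical support counts per time period for three access sequences.
histories = {
    "home>news>sports": [10, 20, 40, 20, 10],
    "home>news>scores": [5, 10, 20, 10, 5],
    "home>shop>cart": [30, 25, 20, 15, 10],
}
clusters = cluster_by_evolution(histories)
print(clusters)
```

In this toy data, the two news-related sequences rise and fall together (as they might around a single event), so they land in the same group even though their absolute support counts differ, while the steadily declining shopping sequence forms its own group.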