Cleopatra: evolutionary pattern-based clustering of web usage data

Authors:
Qiankun Zhao; Sourav S. Bhowmick; Le Gruenwald
Affiliations:
CAIS, Nanyang Technological University, Singapore; CAIS, Nanyang Technological University, Singapore; University of Oklahoma, Norman
Venue:
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Year:
2006

Abstract

Existing web usage mining techniques focus only on discovering knowledge from statistical measures of the static characteristics of web usage data; they do not consider its dynamic nature. In this paper, we present an algorithm called Cleopatra (CLustering of EvOlutionary PAtTeRn-based web Access sequences) that clusters web access sequences ($\mathcal{WAS}$s) based on their evolutionary patterns. In this approach, web access sequences whose support counts have changed in similar ways over the history are grouped into the same cluster. The intuition is that $\mathcal{WAS}$s are often event- or task-driven, so $\mathcal{WAS}$s related to the same event or task are expected to be accessed in similar ways over time. Such clusters are useful for applications such as intelligent web site maintenance and personalized web services.
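
The grouping idea in the abstract can be illustrated with a small sketch. This is not the Cleopatra algorithm itself, whose details the abstract does not give. It assumes that each web access sequence already has a history of support counts, one per time period, that "similar change patterns" can be approximated by the Pearson correlation of period-to-period changes, and that sequences are linked into clusters whenever that correlation reaches a threshold. The function names, the threshold of 0.9, and the sample histories are all hypothetical.

```python
from itertools import combinations


def deltas(series):
    """Period-to-period change in support count."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation of two equal-length lists.

    Assumption: two flat (zero-variance) lists count as identical
    patterns (1.0); one flat and one varying list count as unrelated (0.0).
    """
    n = len(x)
    if n == 0:
        return 0.0
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 1.0 if vx == vy else 0.0
    return cov / (vx * vy) ** 0.5


def cluster_by_evolution(histories, threshold=0.9):
    """Group sequences whose support-count changes correlate >= threshold.

    `histories` maps a sequence (tuple of pages) to its list of support
    counts. Linking is transitive (single-link), done with union-find.
    """
    keys = list(histories)
    parent = {k: k for k in keys}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    change = {k: deltas(histories[k]) for k in keys}
    for a, b in combinations(keys, 2):
        if pearson(change[a], change[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for k in keys:
        groups.setdefault(find(k), []).append(k)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical support-count histories over five periods.
histories = {
    ("a", "b"): [10, 20, 40, 20, 10],
    ("a", "c"): [5, 10, 20, 10, 5],
    ("d", "e"): [40, 20, 10, 20, 40],
}
clusters = cluster_by_evolution(histories)
```

In this toy input, the first two sequences rise and fall together even though their absolute support counts differ, so they land in one cluster, while the third moves in the opposite direction and stays on its own. Comparing changes rather than raw counts is what makes the grouping depend on evolution over time instead of on static popularity.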