A divergence-oriented approach for web users clustering

  • Authors:
  • Sophia G. Petridou;Vassiliki A. Koutsonikola;Athena I. Vakali;Georgios I. Papadimitriou

  • Affiliations:
  • Dept of Informatics Aristotle University, Thessaloniki, Greece;Dept of Informatics Aristotle University, Thessaloniki, Greece;Dept of Informatics Aristotle University, Thessaloniki, Greece;Dept of Informatics Aristotle University, Thessaloniki, Greece

  • Venue:
  • ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part II
  • Year:
  • 2006

Abstract

Clustering web users based on their access patterns is a significant task in Web Usage Mining. Beyond the clustering itself, it is important to evaluate the resulting clusters in order to choose the best clustering for a particular framework. This paper examines the use of Kullback-Leibler (KL) divergence, an information-theoretic distance, in conjunction with the k-means clustering algorithm. It compares KL-divergence with other well-known distance measures (Euclidean, Standardized Euclidean and Manhattan) and evaluates the clustering results using both the objective function’s value and the Davies-Bouldin index. Since it is imperative to assess whether the results of a clustering process are susceptible to noise, especially in noisy settings such as the Web, our approach takes the impact of noise into account. The clusters obtained with the KL approach seem to be superior to those obtained with the other distance measures when the data have been corrupted by noise.
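
To make the idea of pairing KL-divergence with k-means concrete, the following is a minimal sketch and not the authors' implementation. It assumes each user is represented as a probability distribution over page categories built from visit counts, that zero counts are smoothed with a small constant `eps`, and that each centroid is the arithmetic mean of its members. The function names (`normalize`, `kl_divergence`, `kl_kmeans`), the toy visit counts, and the choice of initial centroids are all hypothetical.

```python
import math

def normalize(counts, eps=1e-9):
    """Turn raw visit counts into a probability distribution.
    eps is an assumed smoothing constant that keeps every entry positive."""
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * ln(p_i / q_i); asymmetric in p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def kl_kmeans(users, initial_centroids, iters=10):
    """k-means loop with KL-divergence in place of Euclidean distance."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid under D_KL(user || centroid).
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: kl_divergence(u, centroids[k]))
            for u in users
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop early
        labels = new_labels
        # Update step: mean of member distributions (empty clusters keep
        # their previous centroid).
        for k in range(len(centroids)):
            members = [u for u, lab in zip(users, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data: visit counts of four users over three page categories.
raw_counts = [[9, 1, 0], [8, 2, 0], [0, 1, 9], [1, 0, 9]]
users = [normalize(c) for c in raw_counts]
labels, centroids = kl_kmeans(users, [users[0], users[2]])
print(labels)
```

On this toy input the first two users fall into one cluster and the last two into the other. Swapping `kl_divergence` for a Euclidean or Manhattan distance function in the assignment step would give the kind of baseline the abstract compares against.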