Similarity-based clustering of Web transactions

  • Authors: Giuseppe Manco, Riccardo Ortale, Domenico Saccà
  • Affiliations: ICAR-CNR, Rende (CS), Italy; University of Calabria, Rende (CS), Italy; University of Calabria, Rende (CS), Italy
  • Venue: Proceedings of the 2003 ACM Symposium on Applied Computing
  • Year: 2003

Quantified Score

Hi-index 0.03

Abstract

We introduce a measure of similarity between two sequences of Web page accesses, to be exploited in a clustering approach that groups sessions of accesses to a Web site. The notion of sequence similarity is parametric in the sequence topology and in the similarity among the Web pages within the sequences. In our formalization, two Web pages are similar if they can be considered synonyms not only from a content point of view but also from a usage point of view, i.e., if users exhibit the same behavior on both pages. The refined notion of page similarity, together with the related notion of sequence similarity, is envisaged to be effective in applying a centroid-based clustering technique to the personalization of the Web experience.
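
The abstract does not give the formulas, so the following Python sketch is only an illustration of the general idea, not the authors' measure. It assumes a page similarity that linearly blends a content score and a usage score (the weight `alpha` and the function names are hypothetical), and it substitutes a simple order-preserving alignment, normalized by the longer session, for the paper's topology-aware sequence similarity.

```python
from typing import Callable, Sequence

# A page-level similarity function returning a score in [0, 1].
PageSim = Callable[[str, str], float]


def page_similarity(a: str, b: str, content: PageSim, usage: PageSim,
                    alpha: float = 0.5) -> float:
    """Hypothetical blend of content-based and usage-based page similarity.

    `alpha` is an assumed weight, not a parameter taken from the paper.
    """
    return alpha * content(a, b) + (1.0 - alpha) * usage(a, b)


def sequence_similarity(s: Sequence[str], t: Sequence[str],
                        page_sim: PageSim) -> float:
    """Order-aware session similarity in [0, 1] (illustrative only).

    Finds the best order-preserving alignment of the two sessions, where
    aligning page s[i] with page t[j] earns page_sim(s[i], t[j]), and
    normalizes the total by the length of the longer session.
    """
    if not s or not t:
        return 0.0
    n, m = len(s), len(t)
    # best[i][j] = best alignment score of s[:i] against t[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                      # skip a page of s
                best[i][j - 1],                      # skip a page of t
                best[i - 1][j - 1] + page_sim(s[i - 1], t[j - 1]),
            )
    return best[n][m] / max(n, m)


def exact(a: str, b: str) -> float:
    """Baseline page similarity: identical pages only."""
    return 1.0 if a == b else 0.0
```

With the `exact` baseline, two sessions that visit the same pages in reverse order score lower than identical sessions, which is the kind of order sensitivity a sequence-based measure is meant to capture. A pairwise function of this shape could then be plugged into a centroid-based clustering routine, provided a suitable notion of session centroid is defined.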