Rough clustering of sequential data

  • Authors:
  • Pradeep Kumar;P. Radha Krishna;Raju S. Bapi;Supriya Kumar De

  • Affiliations:
  • Business Intelligence Lab, Institute for Development and Research in Banking Technology (IDRBT), 1, Castle Hills, Masab Tank, Hyderabad 500057, India and Computational Intelligence Lab, Department ...;Business Intelligence Lab, Institute for Development and Research in Banking Technology (IDRBT), 1, Castle Hills, Masab Tank, Hyderabad 500057, India;Computational Intelligence Lab, Department of Computer and Information Sciences, University of Hyderabad, Gachibowli, Hyderabad 500046, India;XLRI Jamshedpur, C.H. Area, Jamshedpur 831001, India

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2007

Abstract

This paper presents a new indiscernibility-based rough agglomerative hierarchical clustering algorithm for sequential data. In this approach, the indiscernibility relation is extended to a tolerance relation by relaxing the transitivity property. Initial clusters are formed using a similarity upper approximation. Subsequent clusters are formed using the concept of constrained-similarity upper approximation, wherein a condition of relative similarity is used as the merging criterion. We report experimental results on the msnbc web navigation dataset, which is intrinsically sequential in nature. We compare the results of the proposed approach with those of the traditional hierarchical clustering algorithm using vector coding of sequences. The results establish the viability of the proposed approach. The rough clusters produced by the proposed algorithm describe the different navigation orientations of the users present in the sessions without forcing each object into only one group. Such descriptions can help web miners identify potential and meaningful groups of users.
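The first step described in the abstract, forming initial clusters from a similarity upper approximation under a tolerance relation, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the LCS-based `similarity` function is a stand-in (the abstract does not name the paper's similarity measure), the `threshold` value is arbitrary, and the toy `sessions` are invented single-letter page codes, not msnbc data. The constrained merging step is not shown.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Stand-in order-aware similarity in [0, 1] (an assumption, not the paper's measure)."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


def similarity_upper_approximation(sequences, threshold):
    """Map each sequence index to the set of indices it tolerates.

    The relation "similarity >= threshold" is reflexive and symmetric but
    not necessarily transitive, so the resulting sets may overlap.
    """
    return {
        i: {j for j, t in enumerate(sequences) if similarity(s, t) >= threshold}
        for i, s in enumerate(sequences)
    }


# Toy sessions: each letter is a hypothetical page category.
sessions = ["ABCD", "ABCE", "ABFE", "XYZ"]
clusters = similarity_upper_approximation(sessions, threshold=0.75)

for index, members in clusters.items():
    print(sessions[index], "->", sorted(sessions[m] for m in members))
```

In this toy run, "ABCE" is similar to both "ABCD" and "ABFE", yet those two are not similar to each other. The initial clusters therefore overlap, which is the behavior the abstract refers to when it says objects need not fit into only one group.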