A practical approach for clustering transaction data

  • Author: Mohamed Bouguessa

  • Affiliation: Université du Québec en Outaouais, Département d'informatique et d'ingénierie, Quebec, Canada

  • Venue: MLDM'11: Proceedings of the 7th International Conference on Machine Learning and Data Mining in Pattern Recognition
  • Year: 2011

Abstract

In this paper, we consider the problem of clustering transaction data. Most existing transactional clustering algorithms encounter difficulties in the presence of overlapping clusters with a large number of outlier items that do not contribute to the formation of clusters. Furthermore, the vast majority of existing approaches depend on multiple parameters that may be difficult to tune, especially in real-life applications. To address these problems, we propose a parameter-free transactional clustering algorithm. Our algorithm first scans the data set sequentially, such that the destination cluster of each incoming transaction is guided by a novel objective function. Once the first scan of the data set is completed, the algorithm performs a few additional passes over the data set in order to refine the clustering. The proposed algorithm is able to automatically identify clusters in the presence of a large number of outlier items in the data set, without any parameter setting by the user. The suitability of our proposal is demonstrated through an empirical study using synthetic and real data sets.
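
The two-phase scheme described in the abstract (a sequential first scan, followed by a few refinement passes) can be illustrated with a minimal Python sketch. The abstract does not define the paper's objective function, so the score used here is only a stand-in, loosely modeled on the profit criterion of CLOPE with the repulsion exponent fixed at 2; it is not the author's function. All names (cluster_score, best_destination, cluster_transactions) are hypothetical, and max_passes is merely a safety cap for this sketch rather than a parameter of the proposed method.

```python
from collections import Counter


def cluster_score(transactions):
    """Stand-in objective term (NOT the paper's function, which the abstract
    does not specify): item occurrences * cluster size / distinct items ** 2."""
    counts = Counter(item for t in transactions for item in t)
    distinct = len(counts)
    if distinct == 0:  # empty cluster, or only empty transactions
        return 0.0
    return sum(counts.values()) * len(transactions) / distinct ** 2


def gain(cluster, t):
    """Change in the stand-in score of one cluster if transaction t joins it."""
    return cluster_score(cluster + [t]) - cluster_score(cluster)


def best_destination(clusters, t, current=None):
    """Return the index of the cluster with the largest gain for t.
    The index len(clusters) means 'open a new cluster'. When `current` is
    given, ties keep t where it is, so refinement does not oscillate."""
    new = len(clusters)
    if current is None:
        best, best_gain = new, cluster_score([t])
    else:
        best, best_gain = current, gain(clusters[current], t)
    for i in range(new + 1):
        g = cluster_score([t]) if i == new else gain(clusters[i], t)
        if g > best_gain:
            best, best_gain = i, g
    return best


def cluster_transactions(data, max_passes=10):
    """Sequential first scan, then refinement passes until nothing moves.
    max_passes is a hypothetical safety cap for this sketch only."""
    data = [frozenset(t) for t in data]
    clusters, labels = [], []

    # First scan: place each transaction where the objective grows most.
    for t in data:
        k = best_destination(clusters, t)
        if k == len(clusters):
            clusters.append([])
        clusters[k].append(t)
        labels.append(k)

    # Refinement: take each transaction out and re-place it.
    for _ in range(max_passes):
        moved = False
        for j, t in enumerate(data):
            old = labels[j]
            clusters[old].remove(t)
            k = best_destination(clusters, t, current=old)
            if k == len(clusters):
                clusters.append([])
            clusters[k].append(t)
            labels[j] = k
            moved = moved or k != old
        if not moved:
            break
    return labels


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b", "d"}, {"x", "y", "z"},
           {"x", "y", "w"}, {"a", "b", "c"}]
    print(cluster_transactions(toy))
```

In this toy run, transactions that share most of their items should receive the same label. The sketch recomputes scores from scratch for readability; a practical implementation would presumably maintain per-cluster item counts incrementally, and the treatment of outlier items described in the abstract is not modeled here.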