A Scalable Approach to Balanced, High-Dimensional Clustering of Market-Baskets

  • Authors:
  • Alexander Strehl; Joydeep Ghosh

  • Venue:
  • HiPC '00 Proceedings of the 7th International Conference on High Performance Computing
  • Year:
  • 2000

Abstract

This paper presents Opossum, a novel similarity-based clustering approach based on constrained, weighted graph partitioning. Opossum is particularly attuned to real-life market baskets, which are characterized by very high-dimensional, highly sparse customer-product matrices with positive ordinal attribute values and a significant number of outliers. Because it is built on top of Metis, a well-known and highly efficient graph-partitioning algorithm, it inherits that algorithm's scalability and ease of parallelization. Results are presented on a real retail-industry data set of several thousand customers and products, with the help of Clusion, a cluster visualization tool.
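
The general idea of similarity-based, balance-constrained graph partitioning can be illustrated with a small sketch. This is not the authors' implementation and it does not call Metis: the similarity measure (extended Jaccard), the function names, the equal-size balance rule, and the greedy assignment heuristic are all assumptions chosen for illustration. Each customer is a vertex, each edge carries the similarity between two baskets, and vertices are assigned to clusters so that no cluster exceeds its share of the customers.

```python
from itertools import combinations


def extended_jaccard(a, b):
    """Extended Jaccard similarity of two non-negative vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


def similarity_graph(baskets, threshold=0.0):
    """Build {(i, j): weight} with i < j for pairs whose similarity exceeds threshold."""
    edges = {}
    for i, j in combinations(range(len(baskets)), 2):
        s = extended_jaccard(baskets[i], baskets[j])
        if s > threshold:
            edges[(i, j)] = s
    return edges


def edge_cut(edges, labels):
    """Total weight of edges whose endpoints fall in different clusters."""
    return sum(w for (i, j), w in edges.items() if labels[i] != labels[j])


def balanced_greedy_partition(n, edges, k):
    """Naive stand-in for a real partitioner such as Metis (hypothetical helper).

    Vertices are visited in index order; each goes to the non-full cluster
    to which it has the largest total similarity. Capacity is ceil(n / k).
    """
    capacity = -(-n // k)  # ceiling division
    labels = [None] * n
    sizes = [0] * k
    for v in range(n):
        best, best_gain = None, -1.0
        for c in range(k):
            if sizes[c] >= capacity:
                continue  # balance constraint: cluster c is already full
            gain = sum(edges.get((u, v), 0.0) for u in range(v) if labels[u] == c)
            if gain > best_gain:
                best, best_gain = c, gain
        labels[v] = best
        sizes[best] += 1
    return labels


# Toy customer-product matrix (rows: customers, columns: products); made-up data.
baskets = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
]
edges = similarity_graph(baskets)
labels = balanced_greedy_partition(len(baskets), edges, k=2)
print(labels, edge_cut(edges, labels))
```

The greedy pass depends on the vertex order and makes no attempt to minimize the edge cut globally; a multilevel partitioner would replace `balanced_greedy_partition` in any serious use.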