Scalable data clustering: a Sammon's projection based technique for merging GSOMs

  • Authors:
  • Hiran Ganegedara;Damminda Alahakoon

  • Affiliations:
  • Cognitive and Connectionist Systems Laboratory, Faculty of Information Technology, Monash University, Australia;Cognitive and Connectionist Systems Laboratory, Faculty of Information Technology, Monash University, Australia

  • Venue:
  • ICONIP'11 Proceedings of the 18th international conference on Neural Information Processing - Volume Part II
  • Year:
  • 2011

Abstract

Self-Organizing Map (SOM) and Growing Self-Organizing Map (GSOM) are widely used techniques for exploratory data analysis. Their key desirable features are applicability to real-world datasets and the ability to visualize high-dimensional data in a low-dimensional output space. One of the core problems of using SOM/GSOM based techniques on large datasets is the long processing time required. A possible solution is to generate multiple maps, one for each subset of the data, where the subsets together make up the entire dataset. However, the advantage of the topographic organization of a single map is lost in this process. This paper proposes a new technique in which Sammon's projection is used to merge an array of GSOMs generated on subsets of a large dataset. We demonstrate that the accuracy of clustering is preserved after the merging process. The technique can take advantage of parallel computing resources.
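The abstract does not spell out the merging procedure, so the following is only a minimal sketch of the Sammon's projection step, not the authors' method. It assumes, purely for illustration, that the node weight vectors of separately trained maps are pooled and then projected to two dimensions by minimizing Sammon's stress with a simple backtracking gradient descent. The names `sammon_projection`, `map_a`, and `map_b` are hypothetical, and the random arrays stand in for trained GSOM nodes.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(D, Y, eps=1e-12):
    """Sammon's stress between input distances D and the layout Y."""
    d = pairwise_distances(Y)
    iu = np.triu_indices_from(D, k=1)          # each pair counted once
    Dij, dij = np.maximum(D[iu], eps), d[iu]   # assumes distinct input points
    return ((Dij - dij) ** 2 / Dij).sum() / Dij.sum()

def sammon_projection(X, n_iter=200, seed=0, eps=1e-12):
    """Project the rows of X to 2-D; returns the layout and the stress history."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    D = pairwise_distances(X)
    c = D[np.triu_indices_from(D, k=1)].sum()
    Y = rng.normal(scale=1e-2, size=(n, 2))    # small random initial layout
    history = [sammon_stress(D, Y)]
    step = 1.0
    for _ in range(n_iter):
        diff = Y[:, None, :] - Y[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1))
        # Put ones on the diagonal so that self-pairs contribute zero.
        Dm = np.maximum(D, eps) + np.eye(n)
        dm = np.maximum(d, eps) + np.eye(n)
        coef = (Dm - dm) / (Dm * dm)
        grad = (-2.0 / c) * (coef[:, :, None] * diff).sum(axis=1)
        improved = False
        while step > 1e-12:                    # backtracking line search
            candidate = Y - step * grad
            stress = sammon_stress(D, candidate)
            if stress < history[-1]:
                Y, improved = candidate, True
                history.append(stress)
                step *= 2.0                    # try a bolder step next time
                break
            step *= 0.5
        if not improved:
            break
    return Y, history

# Hypothetical stand-ins for the node weights of two maps trained on subsets.
data_rng = np.random.default_rng(1)
map_a = data_rng.normal(loc=0.0, scale=1.0, size=(10, 4))
map_b = data_rng.normal(loc=3.0, scale=1.0, size=(12, 4))
nodes = np.vstack([map_a, map_b])              # pooled nodes from both maps
Y, history = sammon_projection(nodes)
```

Here `Y` holds one 2-D position per pooled node, and `history` records the stress after each accepted step. The line search is a design choice made for robustness in this sketch; Sammon's original formulation uses a second-order update instead.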