An enhanced clusterer aggregation using nebulous pool

  • Authors:
  • R. J. Anandhi;S. Natarajan;Sunita Abburu

  • Affiliations:
  • Dr MGR University, Chennai, India; PESIT, Bangalore, India; Oxford College of Sc., Bangalore

  • Venue:
  • Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
  • Year:
  • 2010

Abstract

Cluster ensembles are a framework for combining multiple partitionings, obtained from separate clustering runs, into a final consensus clustering. In this paper, we show that a layered approach to combining clusterer outputs can reduce the computation required and also allows the knowledge gained to be reused in further merging. We describe our proposed layered cluster merging technique for spatial datasets and use it in our three-phase nebulous pool aggregator. At the first level, B heterogeneous ensembles are run against the same spatial data set D of n data points to generate clustering results. A voting matrix of size n × B is first generated, from which the agreement count is derived. Based on the agreement count and the degree of agreement, a partial cluster aggregation is obtained. The aggregated clusters so obtained are scanned for data points with a tie problem. Such uncertainly classified data points are removed from the aggregated clusters and stored in a nebulous pool, and are later merged into the final clusters using the maximum overshadow technique. We have also merged vertically sliced clusterings in our homogeneous ensemble aggregation, and found a better substitute for a missing value in any attribute of a given data point. Cluster validation metrics such as cluster accuracy, inter- and intra-cluster density, and error rates have been measured to confirm that our aggregation results are more robust and accurate.
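
The voting and pooling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `partial_aggregate`, the `min_agreement` threshold, and the sample votes are hypothetical, and the sketch assumes that cluster labels have already been aligned across the B clusterers. The abstract does not define the maximum overshadow technique, so the final merge of the pool is left out.

```python
from collections import Counter

def partial_aggregate(votes, min_agreement=0.5):
    """Split points into partially aggregated clusters and a nebulous pool.

    votes: n rows, each holding the B labels one data point received
           (assumption: labels are already aligned across clusterers).
    min_agreement: hypothetical threshold on the degree of agreement.
    """
    clusters = {}  # consensus label -> list of point indices
    pool = []      # indices of uncertainly classified points
    for i, row in enumerate(votes):
        ranked = Counter(row).most_common()
        top_label, top_count = ranked[0]      # agreement count for the winner
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        degree = top_count / len(row)         # degree of agreement in [0, 1]
        if tied or degree < min_agreement:
            pool.append(i)                    # defer to the nebulous pool
        else:
            clusters.setdefault(top_label, []).append(i)
    return clusters, pool

# Voting matrix for n = 5 points and B = 3 clusterers (made-up values).
votes = [
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 2],  # three-way tie
    [1, 1, 0],
]
clusters, pool = partial_aggregate(votes)
print(clusters, pool)
```

In this toy input, only the fourth point receives a different label from every clusterer, so it is the one held back for the later merging phase.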