An enhanced clusterer aggregation using nebulous pool

Authors:
R. J. Anandhi;S. Natarajan;Sunita Abburu
Affiliations:
Dr MGR University, Chennai, India; PESIT, Bangalore, India; Oxford College of Sc., Bangalore
Venue:
Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
Year:
2010

Citing 7
Cited 0

Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Cluster ensembles: a knowledge reuse framework for combining partitionings
Eighteenth national conference on Artificial intelligence

Data Clustering Using Evidence Accumulation
ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02) Volume 4

Cluster ensembles --- a knowledge reuse framework for combining multiple partitions
The Journal of Machine Learning Research

Integrating Microarray Data by Consensus Clustering
ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence

Fuzzy Clustering Ensemble Based on Dual Boosting
FSKD '07 Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 02

A Novel Clusterer Ensemble Algorithm Based on Dynamic Cooperation
FSKD '08 Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 01

Quantified Score

Hi-index	0.00

Abstract

Cluster ensembles are a framework for combining multiple partitionings obtained from separate clustering runs into a final consensus clustering. In this paper, we show that a layered approach to combining the clusterer outputs can reduce the intensive computation involved and also allows the knowledge gained to be reused in further merging. We describe our proposed layered cluster merging technique for spatial datasets and use it in our three-phase nebulous pool aggregator. At the first level, B heterogeneous ensembles are run against the same spatial data set D of n data points to generate clustering results. A voting matrix of size n × B is first generated, from which the agreement count is derived. Based on the agreement count and the degree of agreement, a partial cluster aggregation is obtained. The aggregated clusters so obtained are scanned for data points with a tie problem. Such uncertainly classified data points are then removed from the aggregated clusters and stored in a nebulous pool, and are later merged into our final clusters using the maximum overshadow technique. We have also merged vertically sliced clusterings in our homogeneous ensemble aggregation, and found a better substitute for a missing value in any attribute of a given data point. Cluster validation metrics such as cluster accuracy, inter- and intra-cluster density, and error rates have been measured to confirm that our aggregation results are more robust and accurate.
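The voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: cluster labels from the B runs are taken to be already aligned, the agreement count is taken to be the number of votes for the most frequent label, the degree of agreement is taken to be that count divided by B, and a "tie" means two labels share the top count. The abstract does not define the maximum overshadow technique, so the final merge below uses a plain nearest-centroid rule as a stand-in. All function names are hypothetical.

```python
from collections import Counter


def aggregate_by_vote(voting_matrix):
    """Split points into agreed labels and a nebulous pool.

    voting_matrix: n rows, each holding B cluster labels (one per run).
    Assumption: labels from different runs are already aligned.
    """
    agreed = {}         # point index -> (label, agreement count, degree)
    nebulous_pool = []  # indices of points whose top vote is tied
    for i, votes in enumerate(voting_matrix):
        ranked = Counter(votes).most_common()
        top_label, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if tied:
            nebulous_pool.append(i)
        else:
            agreed[i] = (top_label, top_count, top_count / len(votes))
    return agreed, nebulous_pool


def assign_pool_to_nearest_centroid(points, agreed, nebulous_pool):
    """Stand-in final merge: nearest centroid, NOT maximum overshadow.

    Assumes at least one point has an agreed label.
    """
    members = {}
    for i, (label, _, _) in agreed.items():
        members.setdefault(label, []).append(points[i])
    # Centroid of each partially aggregated cluster.
    centroids = {
        label: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for label, pts in members.items()
    }
    final = {i: label for i, (label, _, _) in agreed.items()}
    for i in nebulous_pool:
        final[i] = min(
            centroids,
            key=lambda lab: sum(
                (a - b) ** 2 for a, b in zip(points[i], centroids[lab])
            ),
        )
    return final


# Illustrative data: n = 5 points, B = 4 clustering runs.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (1, 1)]
voting_matrix = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 1],  # two votes each: a tie, so this point is pooled
]
agreed, pool = aggregate_by_vote(voting_matrix)
final = assign_pool_to_nearest_centroid(points, agreed, pool)
```

In this toy run the last point is the only one placed in the nebulous pool, and the stand-in rule then attaches it to the cluster whose centroid is closest.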