A clustering ensemble based on a modified normalized mutual information metric

  • Authors: Hamid Parvin; Behzad Maleki; Sajad Parvin
  • Affiliations: Islamic Azad University, Iran; Islamic Azad University, Iran; Islamic Azad University, Iran
  • Venue: AMT'12 Proceedings of the 8th international conference on Active Media Technology
  • Year: 2012

Abstract

Ensemble learning has been shown to be a sound approach to obtaining more accurate, stable, robust, and novel results in data mining tasks such as clustering, classification, and regression. Clustering ensembles, a sub-field of ensemble learning, are a general approach to improving the performance of clustering. This paper defines a new cluster validation criterion, Modified Normalized Mutual Information (MNMI), and proposes a clustering ensemble framework based on it. In the framework, a large number of clusters are first generated, and those that satisfy a threshold on the proposed metric are selected to participate in the final ensemble. The selected clusters are combined with a co-association based consensus function. Because the Evidence Accumulation Clustering (EAC) method cannot derive the co-association matrix from a subset of clusters, Extended Evidence Accumulation Clustering (EEAC) is used to construct the matrix from the selected subset. The resulting ensemble is evaluated on several well-known standard datasets. The empirical results are promising for the ensemble obtained with the proposed criterion compared with the ensemble obtained with the standard cluster validation criterion.
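
The selection and co-association steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the MNMI formula, so the per-cluster scores are taken as precomputed inputs, and the function names `select_clusters` and `coassociation_from_clusters` are hypothetical. The normalization used here, dividing the number of selected clusters that contain both points by the larger of the two points' appearance counts, is an assumption about how EEAC handles points that do not appear in every selected cluster.

```python
import numpy as np


def select_clusters(clusters, scores, threshold):
    """Keep the clusters whose validation score meets the threshold.

    `scores` stands in for a per-cluster criterion such as MNMI;
    its computation is not shown because the abstract does not define it.
    """
    return [c for c, s in zip(clusters, scores) if s >= threshold]


def coassociation_from_clusters(clusters, n_points):
    """Build a co-association matrix from an arbitrary subset of clusters.

    Assumed normalization: C[i, j] = n_ij / max(n_i, n_j), where n_ij counts
    the clusters containing both i and j, and n_i counts those containing i.
    """
    together = np.zeros((n_points, n_points))  # n_ij
    appear = np.zeros(n_points)                # n_i
    for members in clusters:
        idx = np.fromiter(set(members), dtype=int)
        together[np.ix_(idx, idx)] += 1
        appear[idx] += 1
    denom = np.maximum.outer(appear, appear)
    # Pairs in which neither point was ever selected keep a value of 0.
    return np.divide(together, denom, out=np.zeros_like(together), where=denom > 0)


# Toy example: four points, four candidate clusters with made-up scores.
candidates = [[0, 1], [0, 1, 2], [2, 3], [3]]
toy_scores = [0.9, 0.8, 0.7, 0.2]
selected = select_clusters(candidates, toy_scores, threshold=0.5)
coassoc = coassociation_from_clusters(selected, n_points=4)
```

A consensus partition could then be extracted by running a hierarchical clustering algorithm on the dissimilarity `1 - coassoc`, as is usual for co-association based consensus functions.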