G-ANMI: A mutual information based genetic clustering algorithm for categorical data

  • Authors:
  • Shengchun Deng; Zengyou He; Xiaofei Xu

  • Affiliations:
  • Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China; Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China; Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2010

Abstract

Identifying meaningful clusters in categorical data is a key problem in data mining. Recently, Average Normalized Mutual Information (ANMI) has been used to define categorical data clustering as an optimization problem. To find a globally optimal or near-optimal partition under the ANMI objective, this paper proposes a genetic clustering algorithm (G-ANMI). Experimental results show that G-ANMI is superior or comparable to existing algorithms for clustering categorical data in terms of clustering accuracy.
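
To make the ANMI objective concrete, the following is a minimal sketch of how such a score could be computed for a candidate partition. It rests on assumptions the abstract does not state: each categorical attribute is treated as its own partition of the records, and normalized mutual information is taken as I(X;Y) / sqrt(H(X) H(Y)). The function names (`entropy`, `mutual_information`, `nmi`, `anmi`) are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label sequences of equal length."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def nmi(a, b):
    """Assumed normalization: I(a;b) / sqrt(H(a) * H(b)); 0 if either entropy is 0."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        return 0.0
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


def anmi(partition, records):
    """Average NMI between a candidate partition and each attribute column.

    Assumption: every attribute is viewed as a partition of the records.
    """
    columns = list(zip(*records))
    return sum(nmi(partition, col) for col in columns) / len(columns)


# Hypothetical toy data: four records with two categorical attributes.
records = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
aligned = anmi([0, 0, 1, 1], records)      # partition agrees with both attributes
independent = anmi([0, 1, 0, 1], records)  # partition independent of both attributes
```

On this toy data the aligned partition scores close to 1 and the independent one scores 0. A genetic search of the kind the abstract describes could use a score like this as its fitness function, although the paper's actual encoding and operators are not given here.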