An Improved Partitioning-Based Web Documents Clustering Method Combining GA with ISODATA

  • Authors:
  • Zhengyu Zhu;Yunyan Tian;Jingqiu Xu;Xin Deng;Xiang Ren

  • Affiliations:
  • Chongqing University, Chongqing 400044, China (all authors)

  • Venue:
  • FSKD '07 Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 02
  • Year:
  • 2007

Abstract

Existing partitioning-based clustering algorithms, such as k-means, k-medoids, and their variants, are simple in theory and converge quickly, but they generally terminate at a local optimum and are poorly suited to discovering clusters whose sizes differ greatly. This paper proposes an improved Web document clustering method based on a genetic algorithm (GA) whose mutation operation incorporates ideas from ISODATA [6]. Experiments show that the GA's global search can avoid local optima, and that the ISODATA-based mutation operation gives the improved clustering algorithm a self-adjusting ability to discover clusters of different sizes.
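
The abstract does not spell out the mutation operator, so the following is only a rough sketch of what an ISODATA-style mutation might look like, not the authors' implementation. It assumes a chromosome is a list of cluster centroids and borrows the classic ISODATA split and merge steps. The function names and the thresholds `split_std` and `merge_dist` are hypothetical, and the 50/50 choice between splitting and merging is likewise an assumption.

```python
import math
import random


def assign(points, centroids):
    """Group points by the index of their nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))


def split_step(live, split_std):
    """Split the first cluster whose largest per-dimension std exceeds split_std."""
    means = [m for m, _ in live]
    for idx, (m, c) in enumerate(live):
        stds = [math.sqrt(sum((p[d] - m[d]) ** 2 for p in c) / len(c))
                for d in range(len(m))]
        d = max(range(len(m)), key=lambda j: stds[j])
        if stds[d] > split_std:
            # Two new centroids, one std below and one std above the old mean.
            lo = tuple(v - stds[d] if j == d else v for j, v in enumerate(m))
            hi = tuple(v + stds[d] if j == d else v for j, v in enumerate(m))
            return means[:idx] + [lo, hi] + means[idx + 1:]
    return means


def merge_step(live, merge_dist):
    """Merge the closest pair of centroids if they are nearer than merge_dist."""
    pairs = [(math.dist(live[i][0], live[j][0]), i, j)
             for i in range(len(live)) for j in range(i + 1, len(live))]
    if pairs:
        dist, i, j = min(pairs)
        if dist < merge_dist:
            # Mean over both clusters' points, i.e. a size-weighted centroid.
            merged = mean(live[i][1] + live[j][1])
            rest = [m for k, (m, _) in enumerate(live) if k not in (i, j)]
            return rest + [merged]
    return [m for m, _ in live]


def isodata_mutate(centroids, points, split_std=1.0, merge_dist=1.0, rng=random):
    """Hypothetical mutation: drop empty clusters, then try one split or one merge."""
    live = [(mean(c), c) for c in assign(points, centroids) if c]
    if rng.random() < 0.5:  # assumed 50/50 choice; not taken from the paper
        return split_step(live, split_std)
    return merge_step(live, merge_dist)
```

Because splitting and merging change the number of centroids, a mutation of this kind lets the number and extent of clusters adapt during the search, which is one plausible reading of the "self-adjusting ability" the abstract describes.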