One optimized choosing method of k-means document clustering center

  • Authors:
  • Hongguang Suo; Kunming Nie; Xin Sun; Yuwei Wang

  • Affiliations:
  • School of Computer and Communication Engineering, China University of Petroleum, Dongying, China (all four authors)

  • Venue:
  • AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
  • Year:
  • 2008

Abstract

A cluster-center selection method based on sub-graph division is presented. After the similarity matrix is constructed, a graph is built with each text as a vertex; the resulting disconnected sub-graphs are then analyzed. Within an allowable error range, the method automatically determines both the number of cluster centers and the centers themselves. Noise data are eliminated effectively in the process of finding the cluster centers. Experimental results on two documents show that the method is effective: compared with traditional methods, the F-Measure increases by 8%.
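
The abstract does not give the algorithmic details, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes cosine similarity, an edge between two texts whenever their similarity reaches a hypothetical `threshold`, a hypothetical `min_size` below which a sub-graph is treated as noise, and a center rule (the member most similar to the rest of its sub-graph) chosen purely for illustration.

```python
import math
from collections import deque

def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(docs):
    """Pairwise similarity matrix over all document vectors."""
    n = len(docs)
    return [[cosine(docs[i], docs[j]) for j in range(n)] for i in range(n)]

def subgraphs(sim, threshold):
    """Connected components of the graph whose edges join texts with
    similarity >= threshold (the threshold is an assumed parameter)."""
    n = len(sim)
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue, comp = deque([start]), []
        while queue:  # breadth-first search from `start`
            i = queue.popleft()
            comp.append(i)
            for j in range(n):
                if j != i and not seen[j] and sim[i][j] >= threshold:
                    seen[j] = True
                    queue.append(j)
        components.append(sorted(comp))
    return components

def choose_centers(docs, threshold=0.8, min_size=2):
    """Return one document index per sufficiently large sub-graph.
    Sub-graphs smaller than min_size are skipped as noise (assumption)."""
    sim = similarity_matrix(docs)
    centers = []
    for comp in subgraphs(sim, threshold):
        if len(comp) < min_size:
            continue  # isolated texts are treated as noise
        # Assumed center rule: the member most similar to the others.
        best = max(comp, key=lambda i: sum(sim[i][j] for j in comp if j != i))
        centers.append(best)
    return centers

# Toy term-weight vectors: two groups plus one isolated text (index 5).
docs = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [1.0, 0.0, 0.1],
    [0.1, 1.0, 0.0],
    [0.0, 0.9, 0.2],
    [0.0, 0.0, 1.0],
]
centers = choose_centers(docs)
print(len(centers), centers)
```

Under these assumptions, the number of returned indices plays the role of k, and the corresponding vectors could be passed to k-means as initial centers in place of random seeds.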