A comparative study on text clustering methods

  • Authors:
  • Yan Zheng;Xiaochun Cheng;Ronghuai Huang;Yi Man

  • Affiliations:
  • School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing, China;Knowledge Science and Engineering Institute, Beijing Normal University, Beijing, China;Knowledge Science and Engineering Institute, Beijing Normal University, Beijing, China;School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing, China

  • Venue:
  • ADMA'06 Proceedings of the Second International Conference on Advanced Data Mining and Applications
  • Year:
  • 2006

Abstract

Text clustering is one of the most important research areas in text mining, which processes text automatically to discover implicit knowledge. It groups texts into clusters by content, without a priori knowledge. In this paper, we study different text clustering methods and use three text clustering validation criteria to evaluate the experimental results. We compare and contrast the effectiveness of the k-means and FIHC text clustering methods through experiments, and discuss the differing quality of the resulting text clusters.
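
As a rough illustration of the kind of experiment the abstract describes, the sketch below clusters a few toy documents with a cosine-similarity variant of k-means and scores the result against known labels. It is not the authors' implementation. The abstract does not name its three validation criteria, so purity is used here purely as an assumed stand-in. The toy documents, the fixed seed indices, and all function names are hypothetical, and FIHC is not covered.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector scaled to unit length.

    Assumes the text contains at least one token.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def mean_vector(vectors):
    """Centroid of a non-empty group of vectors, rescaled to unit length."""
    total = Counter()
    for v in vectors:
        for term, w in v.items():
            total[term] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {term: w / norm for term, w in total.items()}


def kmeans(vectors, seeds, max_iter=20):
    """k-means with cosine similarity; `seeds` are indices of the initial centroids."""
    centroids = [vectors[i] for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar centroid.
        new_labels = [
            max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: recompute each non-empty cluster's centroid.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


def purity(labels, truth):
    """Fraction of documents that belong to the majority class of their cluster."""
    per_cluster = {}
    for lab, cls in zip(labels, truth):
        per_cluster.setdefault(lab, Counter())[cls] += 1
    return sum(max(c.values()) for c in per_cluster.values()) / len(labels)


# Hypothetical toy corpus with known classes, used only for illustration.
docs = [
    "stock market shares trading",
    "market shares fall stock",
    "football match goal team",
    "team wins football match",
]
truth = ["finance", "finance", "sport", "sport"]

vectors = [tf_vector(d) for d in docs]
labels = kmeans(vectors, seeds=(0, 2))
score = purity(labels, truth)
print(labels, score)
```

Fixed seed indices keep the run deterministic. In practice k-means is sensitive to initialization, so comparisons of this kind are usually averaged over several random starts.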