Document clustering: an optimization problem

  • Authors: Ao Feng
  • Affiliations: UMass Amherst, Amherst, MA
  • Venue: SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 2007

Abstract

Clustering algorithms are widely used in information retrieval applications, but it is difficult to define an objectively "best" result. This article analyzes several document clustering algorithms and shows that each is equivalent to optimizing a global function. Experiments show that they perform well, but counter-examples remain in which they fail to return the optimal solution. We argue that Monte Carlo algorithms within the global optimization framework have the potential to find better solutions than traditional clustering, and that they can handle more complex structures.
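To illustrate the idea of treating clustering as global optimization, here is a minimal sketch of a Monte Carlo search over cluster assignments. The abstract does not say which global functions or which Monte Carlo procedure the paper uses, so everything specific here is an assumption: the objective is the within-cluster sum of squared distances (the function that k-means locally minimizes), the search is simulated annealing, and the names `objective` and `monte_carlo_cluster` and all parameter values are hypothetical.

```python
import math
import random


def objective(points, labels, k):
    """Assumed global function: within-cluster sum of squared distances
    to each cluster centroid. Lower is better."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
    return total


def monte_carlo_cluster(points, k, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Simulated annealing over label assignments (an illustrative choice,
    not necessarily the procedure used in the paper)."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]
    cost = objective(points, labels, k)
    best_labels, best_cost = labels[:], cost
    temp = t0
    for _ in range(steps):
        # Propose moving one randomly chosen point to a random cluster.
        i = rng.randrange(len(points))
        old = labels[i]
        labels[i] = rng.randrange(k)
        new_cost = objective(points, labels, k)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops, to escape local optima.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old  # reject the move
        temp *= cooling
    return best_labels, best_cost


if __name__ == "__main__":
    # Toy 2-D "documents": two visually separated groups.
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, cost = monte_carlo_cluster(docs, k=2)
    print(labels, cost)
```

Unlike a greedy procedure that only accepts improving moves, this search occasionally accepts a worse assignment, which is what gives it a chance of leaving a local optimum. The price is many more evaluations of the objective; a practical version would update the cost incrementally instead of recomputing it at every step.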