Hierarchic document classification using Ward's clustering method

  • Authors:
  • A. El-Hamdouchi; P. Willett

  • Affiliations:
  • Sheffield University, Western Bank, Sheffield, S10 2TN, UK; Sheffield University, Western Bank, Sheffield, S10 2TN, UK

  • Venue:
  • Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1986

Abstract

In this paper, we discuss the application of a recent hierarchic clustering algorithm to the automatic classification of files of documents. Whereas most hierarchic clustering algorithms involve the generation and updating of an inter-object dissimilarity matrix, this new algorithm is based upon a series of nearest neighbor searches. Such an approach is appropriate to several clustering methods, including Ward's method, which has been shown to perform well in experimental studies of hierarchic document clustering. A description is given of heuristics which can increase the efficiency of the new algorithm when it is used to cluster three document collections by Ward's method.
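
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how Ward's method can be driven by repeated nearest neighbor searches instead of a stored dissimilarity matrix. It assumes a nearest-neighbor-chain strategy (follow nearest neighbors until two clusters are each other's nearest neighbor, then merge them), documents represented as dense numeric vectors, and the usual Ward merge cost n_a n_b / (n_a + n_b) times the squared Euclidean distance between centroids. The names `ward_cost` and `ward_nn_chain` are hypothetical, and the paper's efficiency heuristics are not reproduced here.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge.

    Each cluster is a (size, centroid) pair; this representation is an
    assumption of the sketch, not taken from the paper.
    """
    (na, ca), (nb, cb) = a, b
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq_dist


def ward_nn_chain(points):
    """Return merges as (id_a, id_b, cost, new_id) tuples.

    No dissimilarity matrix is kept: each step is a linear nearest
    neighbor search over the clusters that are still active.
    """
    clusters = {i: (1, tuple(map(float, p))) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    chain = []  # each element is the nearest neighbor of the one before it

    while len(clusters) > 1:
        if not chain:
            chain.append(min(clusters))  # arbitrary starting cluster
        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            best, best_cost = None, float("inf")
            for cid, cluster in clusters.items():
                if cid == top:
                    continue
                cost = ward_cost(clusters[top], cluster)
                # On ties, prefer the previous chain element so that the
                # chain cannot cycle.
                if cost < best_cost or (cost == best_cost and cid == prev):
                    best, best_cost = cid, cost
            if best == prev:
                break  # top and prev are reciprocal nearest neighbors
            chain.append(best)

        # Merge the reciprocal pair and replace it by its weighted centroid.
        chain.pop()
        chain.pop()
        na, ca = clusters.pop(top)
        nb, cb = clusters.pop(prev)
        n = na + nb
        centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
        clusters[next_id] = (n, centroid)
        merges.append((prev, top, best_cost, next_id))
        next_id += 1
    return merges


if __name__ == "__main__":
    for merge in ward_nn_chain([[0.0], [1.0], [10.0], [11.0]]):
        print(merge)
```

A chain-based procedure can emit merges out of cost order, so sorting the result by cost is the usual way to obtain a conventional dendrogram ordering. Each search in this sketch is a plain linear scan over the active clusters.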