Non-hierarchical document clustering using the ICL Distributed Array Processor

  • Authors:
  • E. Rasmussen;P. Willett

  • Affiliations:
  • Dept. of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, U.K.;Dept. of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, U.K.

  • Venue:
  • SIGIR '87 Proceedings of the 10th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1987

Abstract

This paper considers the suitability and efficiency of a highly parallel computer, the ICL Distributed Array Processor (DAP), for document clustering. Algorithms are described for the implementation of the single-pass and reallocation clustering methods on the DAP and on a conventional mainframe computer. These methods are used to classify the Cranfield, Vaswani and UKCIS document test collections. The results suggest that the parallel architecture of the DAP is not well suited to the variable-length records which characterise bibliographic data.
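
The abstract names the single-pass method without describing it. As a rough illustration only, the following is a minimal serial sketch of generic single-pass clustering in Python. It is not the authors' DAP or mainframe implementation: the cosine similarity measure, the threshold value, the summed-term-count cluster representative, and the names `cosine` and `single_pass` are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count mappings (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold):
    """Assign each document, in order, to the most similar existing cluster,
    or start a new cluster if no similarity reaches the threshold.

    docs: list of term lists (variable length, as with bibliographic records).
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # each entry: {"rep": Counter of summed terms, "members": [...]}
    for index, terms in enumerate(docs):
        vector = Counter(terms)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vector, cluster["rep"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            best["rep"].update(vector)  # fold the document into the representative
        else:
            clusters.append({"rep": vector, "members": [index]})
    return [cluster["members"] for cluster in clusters]


if __name__ == "__main__":
    # Hypothetical toy documents, not drawn from any test collection.
    sample = [
        ["parallel", "array", "processor"],
        ["array", "processor", "parallel", "simd"],
        ["library", "catalogue"],
        ["catalogue", "library", "index"],
    ]
    print(single_pass(sample, 0.5))  # [[0, 1], [2, 3]]
```

Each document is compared against every existing cluster representative exactly once, so the result depends on the order in which documents are read. Reallocation methods, also named in the abstract, typically address this by making further passes that move documents between clusters.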