Parallel nearest neighbour algorithms for text categorization

  • Authors:
  • Reynaldo Gil-García;José Manuel Badía-Contelles;Aurora Pons-Porrata

  • Affiliations:
  • Center of Pattern Recognition and Data Mining, Universidad de Oriente, Cuba;Dpt. Computer Science and Engineering, Universitat Jaume I, Castellón, Spain;Center of Pattern Recognition and Data Mining, Universidad de Oriente, Cuba

  • Venue:
  • Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
  • Year:
  • 2007

Abstract

In this paper we describe the parallelization of two nearest neighbour classification algorithms. Nearest neighbour methods are well-known machine learning techniques that have been successfully applied to the text categorization task. Based on standard parallel techniques, we propose two versions of each algorithm for message-passing architectures. We also include experimental results on a cluster of personal computers using a large text collection. Our algorithms attempt to balance the load among the processors, are portable, and obtain very good speedups and scalability.
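The abstract does not say which two algorithms are parallelized or how the work is distributed, so the following is only a generic sketch of one standard data-parallel scheme for k-nearest-neighbour classification, not the authors' method. It assumes sparse term-weight vectors stored as dictionaries, cosine similarity, and a majority vote, and it simulates the message-passing step in a single process: each of the `p` hypothetical processors scores the query against its own block of the training set, and the partial top-k lists are then merged as a gather step would do. All function names here are illustrative.

```python
import heapq
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def split_blocks(items, p):
    """Split items into p contiguous blocks whose sizes differ by at most one."""
    size, extra = divmod(len(items), p)
    blocks, start = [], 0
    for rank in range(p):
        end = start + size + (1 if rank < extra else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


def local_top_k(block, query, k):
    """Work done by one (simulated) processor: its k best matches."""
    scored = ((cosine(doc, query), label) for doc, label in block)
    return heapq.nlargest(k, scored)


def classify(train, query, k, p):
    """Merge the per-block top-k lists and take a majority vote."""
    # In a real message-passing program each block would live on its own
    # process and the partial lists would be gathered at one process.
    partials = [local_top_k(block, query, k) for block in split_blocks(train, p)]
    merged = heapq.nlargest(k, (pair for part in partials for pair in part))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


# Toy training set (made up for illustration only).
train = [
    ({"goal": 1.0, "match": 1.0}, "sport"),
    ({"goal": 1.0, "team": 1.0}, "sport"),
    ({"vote": 1.0, "party": 1.0}, "politics"),
    ({"vote": 1.0, "election": 1.0}, "politics"),
    ({"team": 1.0, "match": 1.0}, "sport"),
]
query = {"goal": 1.0, "team": 1.0, "match": 1.0}
label = classify(train, query, k=3, p=2)
```

Because the global k best neighbours are always contained in the union of the per-block k best, the merged result does not depend on the number of blocks; splitting the training set into near-equal blocks is one simple way to keep the per-processor load balanced.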