Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation

  • Authors:
  • José Castro;Jimmy Secretan;Michael Georgiopoulos;Ronald DeMara;Georgios Anagnostopoulos;Avelino Gonzalez

  • Affiliations:
  • Department of Computer Engineering, Technological Institute of Costa Rica, Cartago, Costa Rica;Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL 32816, United States;Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL 32816, United States;Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL 32816, United States;Department of Electrical and Computer Engineering, Florida Institute of Technology, Melbourne, FL 32901, United States;Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL 32816-2786, United States

  • Venue:
  • Neural Networks
  • Year:
  • 2007

Abstract

Fuzzy ARTMAP neural networks have proven to be good classifiers on a variety of classification problems. However, the time that Fuzzy ARTMAP takes to converge to a solution increases rapidly as the number of training patterns increases. In this paper we examine the time Fuzzy ARTMAP takes to converge to a solution, and we propose a coarse-grain parallelization technique, based on a pipeline approach, to speed up the training process. In particular, we have parallelized Fuzzy ARTMAP without the match-tracking mechanism. We provide a series of theorems and associated proofs that characterize the parallel implementation of Fuzzy ARTMAP without match-tracking. Results obtained on a Beowulf cluster with three large databases show linear speedup as a function of the number of processors used in the pipeline. The databases used in our experiments are the Forest CoverType database from the UCI Machine Learning Repository and two artificial databases of 16-dimensional Gaussian-distributed data belonging to two distinct classes, with different amounts of overlap (5% and 15%).