Input Noise Robustness and Sensitivity Analysis to Improve Large Datasets Clustering by Using the GRID

  • Authors:
  • Alberto Faro;Daniela Giordano;Francesco Maiorana

  • Affiliations:
  • Dipartimento di Ingegneria Informatica e Telecomunicazioni, University of Catania, Catania, Italy 95125;Dipartimento di Ingegneria Informatica e Telecomunicazioni, University of Catania, Catania, Italy 95125;Dipartimento di Ingegneria Informatica e Telecomunicazioni, University of Catania, Catania, Italy 95125

  • Venue:
  • DS '08 Proceedings of the 11th International Conference on Discovery Science
  • Year:
  • 2008

Abstract

In this paper we investigate the classification correctness of a refined version of the Kohonen self-organizing feature map algorithm when different kinds of noise are injected into a sparse input matrix, comparing the resulting classifications with the one obtained without noise. The analysis not only indicates the classification errors caused by noisy data, but also yields a methodology for identifying the portion of the input matrix that must be controlled with great care to avoid classification errors. The methodology also suggests a suitable data partitioning approach for a GRID implementation of the described algorithm. These methodological indications were successfully verified in a case study from the bioinformatics field.
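
The noise-injection comparison described in the abstract can be illustrated with a minimal sketch. It assumes a plain textbook 1-D self-organizing map rather than the authors' refined algorithm, a 0/1 input matrix, and bit-flip noise restricted to a chosen block of columns. All function names (`train_som`, `bmu`, `flip_noise`, `disagreement`) are hypothetical and not taken from the paper.

```python
import random

def bmu(weights, row):
    """Index of the unit whose weight vector is closest to row (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda u: sum((w - x) ** 2 for w, x in zip(weights[u], row)))

def train_som(data, n_units=4, epochs=30, seed=0):
    """Train a tiny 1-D SOM (assumption: textbook variant, not the paper's refined one)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)           # linearly decaying learning rate
        radius = 1 if epoch < epochs // 2 else 0  # neighbourhood shrinks halfway through
        for row in data:
            winner = bmu(weights, row)
            lo, hi = max(0, winner - radius), min(n_units, winner + radius + 1)
            for u in range(lo, hi):
                weights[u] = [w + lr * (x - w) for w, x in zip(weights[u], row)]
    return weights

def flip_noise(data, rate, columns, seed=0):
    """Copy a 0/1 matrix, flipping entries in `columns` with probability `rate`."""
    rng = random.Random(seed)
    cols = set(columns)
    return [[1 - x if j in cols and rng.random() < rate else x
             for j, x in enumerate(row)] for row in data]

def disagreement(weights, clean, noisy):
    """Fraction of rows mapped to a different unit after noise injection."""
    changed = sum(bmu(weights, c) != bmu(weights, n) for c, n in zip(clean, noisy))
    return changed / len(clean)

if __name__ == "__main__":
    # Illustrative toy matrix, not data from the paper.
    data = [[1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0],
            [0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 0, 1]]
    weights = train_som(data)
    for block in ([0, 1, 2], [3, 4, 5]):
        noisy = flip_noise(data, 0.2, block)
        print(block, disagreement(weights, data, noisy))
```

Under these assumptions, column blocks with a higher disagreement score would be the portions of the input matrix most sensitive to noise. Because each block is scored independently, such blocks are also one plausible unit for partitioning the work across nodes.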