Effect of feature-type in selecting distance measure for an artificial immune system as a pattern recognizer

  • Authors:
  • Seral Özşen;Salih Güneş

  • Affiliations:
  • Electrical and Electronics Engineering Department, Selcuk University, 42035 Konya, Turkey;Electrical and Electronics Engineering Department, Selcuk University, 42035 Konya, Turkey

  • Venue:
  • Digital Signal Processing
  • Year:
  • 2008

Abstract

In designing an artificial immune system (AIS) for a problem domain, one must first determine a representation type and then select a distance measure to compute the affinity between system units and input data. Euclidean distance is commonly used in many proposed methods, often chosen intuitively or for its simplicity of implementation. This choice, however, must be made carefully, with the properties of the problem domain in mind. For example, most problems use data vectors that mix discrete, real-valued, and nominal feature values. Although Euclidean distance can be applied to such problems, similarity measures designed for these hybrid vectors would give better results. To draw the attention of AIS designers to this point, we tested three distance criteria (Euclidean distance, Manhattan distance, and a hybrid similarity measure) on a simple AIS for the classification of two medical datasets taken from the UCI machine learning repository. One dataset, Statlog heart disease, contains nominal, discrete, and real-valued features, while the other, BUPA liver disorders, consists of purely real-valued features. For the Statlog dataset, the best classification result was obtained with the hybrid similarity measure, as expected given that this dataset contains three types of features. For the BUPA dataset, the results did not differ much across the measures, which is also expected considering the nature of this dataset.
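
To make the comparison concrete, the three kinds of measure can be sketched as follows. The abstract does not define the hybrid similarity measure, so the third function is only a stand-in: it follows the general idea of a heterogeneous Euclidean-overlap distance (overlap for nominal features, range-normalized difference for numeric ones) and should not be read as the authors' formula. The function names and the `nominal` and `ranges` parameters are assumptions introduced for illustration.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Manhattan (city-block) distance between two equal-length numeric vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))

def hybrid_distance(x, y, nominal, ranges):
    """Illustrative mixed-type distance (an assumption, not the paper's measure).

    nominal: set of feature indices treated as nominal (0 if equal, else 1).
    ranges:  dict mapping each numeric feature index to its value range
             (max - min over the training data), used for normalization.
    """
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if i in nominal:
            d = 0.0 if a == b else 1.0          # overlap on nominal features
        else:
            r = ranges[i]
            d = abs(a - b) / r if r else 0.0    # range-normalized difference
        total += d * d
    return math.sqrt(total)
```

With a mixed-type measure of this kind, a nominal code such as a chest-pain type contributes only "same" or "different", so the arbitrary numeric spacing of its codes no longer influences the affinity. On a purely real-valued dataset the nominal branch is never taken, so the measure differs from Euclidean distance only by the range normalization.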