Median strings for k-nearest neighbour classification

  • Authors:
  • C. D. Martínez-Hinarejos; A. Juan; F. Casacuberta

  • Affiliations:
  • Institut Tecnològic d'Informàtica, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Camí de Vera, s/n, 46071 València, ... (all three authors)

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2003

Abstract

Modelling a (large) set of garbled patterns with a prototype is an important issue in pattern recognition. When objects are represented as strings, the representative prototype can be a (generalized) median string: the string, not necessarily a member of the set, that minimizes the sum of distances to the strings of a given set. Finding such a string is an NP-hard problem, so no efficient exact algorithm for computing it is known. For this reason, the set median string, which is the string in the set that minimizes the sum of distances to the strings of the set, is commonly used instead.

Recently, a greedy approach was proposed to compute good approximations to the median string of a set of strings. This work presents the use of approximated median strings with k-nearest-neighbour classifiers.

Exhaustive experiments were carried out on a corpus of chromosomes. They showed that the proposed approximations to the median string represent a given set better than the corresponding set median.
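
To make the definitions in the abstract concrete, the following is a minimal Python sketch of the set median string and of a k-nearest-neighbour rule over string prototypes. It is not the authors' implementation and does not reproduce their greedy approximation. The abstract does not name a string distance, so unit-cost Levenshtein distance is assumed here, and the function names (`edit_distance`, `set_median`, `knn_classify`) are hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Unit-cost Levenshtein distance (an assumed choice of string distance)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def set_median(strings):
    """Member of `strings` with the smallest sum of distances to the set."""
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def knn_classify(query, prototypes, k=1):
    """Majority label among the k prototypes closest to `query`.

    `prototypes` is a list of (string, label) pairs, for example one or
    more median strings per class (hypothetical input layout).
    """
    nearest = sorted(prototypes, key=lambda p: edit_distance(query, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

For example, `set_median(["abc", "abd", "xbc"])` selects `"abc"`, whose distances to the set sum to 2, against 3 for each of the other two strings. The set median costs a quadratic number of distance computations in the size of the set, whereas the generalized median searches over all possible strings, which is what makes it intractable to compute exactly.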