Median strings for k-nearest neighbour classification

Authors:
C. D. Martínez-Hinarejos;A. Juan;F. Casacuberta
Affiliations:
Institut Tecnològic d'Informàtica, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Camí de Vera, s/n, 46071 València, ...;Institut Tecnològic d'Informàtica, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Camí de Vera, s/n, 46071 València, ...;Institut Tecnològic d'Informàtica, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Camí de Vera, s/n, 46071 València, ...
Venue:
Pattern Recognition Letters
Year:
2003


Abstract

Modelling a (large) set of garbled patterns with a prototype is an important issue in pattern recognition. When strings are used as object representations, the representative prototype can be a (generalized) median string. The median string of a set of strings can be defined as the string that minimizes the sum of distances to the strings of the set. Searching for such a string is an NP-hard problem, so no efficient exact algorithm for computing the median string is known. For this reason, the set median string, which is the string in the set that minimizes the sum of distances to the strings of the set, is commonly used instead. Recently, a greedy approach was proposed to compute good approximations to the median string of a set of strings. This work presents the use of approximated median strings with k-nearest-neighbour classifiers. Exhaustive experiments were carried out on a corpus of chromosomes. These experiments showed that the proposed approximations to the median string represent a given set better than the corresponding set median.
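The definitions in the abstract can be illustrated with a minimal sketch. It assumes a unit-cost Levenshtein distance, which the abstract does not specify, and the paper may use a weighted or normalized variant. The function names (`levenshtein`, `sum_of_distances`, `set_median`, `knn_classify`) are hypothetical and chosen only for this illustration. The sketch computes the set median, not the greedy approximation to the generalized median that the paper evaluates.

```python
from collections import Counter


def levenshtein(a, b):
    """Unit-cost edit distance between strings a and b (an assumed distance)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def sum_of_distances(candidate, strings):
    """Objective minimized by both the median and the set median."""
    return sum(levenshtein(candidate, s) for s in strings)


def set_median(strings):
    """The member of `strings` with the smallest sum of distances to the set."""
    return min(strings, key=lambda c: sum_of_distances(c, strings))


def knn_classify(query, prototypes, k=1):
    """Majority vote among the k prototypes closest to `query`.

    `prototypes` is a list of (string, label) pairs, for example one or
    more median strings per class.
    """
    nearest = sorted(prototypes, key=lambda p: levenshtein(query, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

The set median costs a quadratic number of distance computations in the size of the set, but its search space is limited to the set itself. A generalized median may be any string over the alphabet, which is why it can achieve a lower sum of distances and why approximations to it are of interest as prototypes.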