A new iterative algorithm for computing a quality approximate median of strings based on edit operations

Authors:
J. Abreu; J. R. Rico-Juan
Venue:
Pattern Recognition Letters
Year:
2014

Abstract

This paper presents a new algorithm for computing an approximation to the median of a set of strings. The approximate median is obtained by successively improving a partial solution. In each iteration, the edit distance from the partial solution to every string in the set is computed, and the frequency of each edit operation at each position of the approximate median is recorded. A goodness index is then computed for each edit operation by multiplying its frequency by its cost. The operations are tested in decreasing order of this index to check whether applying them to the partial solution yields an improvement. When one does, a new iteration begins from the resulting approximate median. The algorithm terminates when all operations have been examined without finding a better solution. Comparative experiments on Freeman chain codes encoding 2D shapes and on the Copenhagen chromosome database show that the approximate median strings obtained are similar in quality to those of benchmark approaches, while the algorithm converges much faster.
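
The loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes unit costs for every edit operation, starts from the set median (the input string with the lowest total distance), and treats "improvement" as a strictly lower sum of edit distances to the set. The names `edit_ops`, `apply_op`, `total_distance`, `approximate_median`, and `OP_COST` are hypothetical and chosen only for this example.

```python
from collections import Counter

# Assumption: unit cost for every operation (the abstract does not give costs).
OP_COST = {"sub": 1, "del": 1, "ins": 1}

def edit_ops(src, dst):
    """Unit-cost Levenshtein distance from src to dst, plus one optimal list
    of operations (kind, position in src, symbol)."""
    n, m = len(src), len(dst)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = d[i - 1][j - 1] + (src[i - 1] != dst[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    # Backtrace one optimal alignment to recover the operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (src[i - 1] != dst[j - 1]):
            if src[i - 1] != dst[j - 1]:
                ops.append(("sub", i - 1, dst[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, src[i - 1]))
            i -= 1
        else:
            ops.append(("ins", i, dst[j - 1]))
            j -= 1
    return d[n][m], ops

def apply_op(s, op):
    """Apply a single edit operation to string s."""
    kind, pos, sym = op
    if kind == "sub":
        return s[:pos] + sym + s[pos + 1:]
    if kind == "del":
        return s[:pos] + s[pos + 1:]
    return s[:pos] + sym + s[pos:]  # insertion before index pos

def total_distance(candidate, strings):
    """Sum of edit distances from candidate to every string in the set."""
    return sum(edit_ops(candidate, s)[0] for s in strings)

def approximate_median(strings):
    # Assumption: the initial partial solution is the set median.
    current = min(strings, key=lambda s: total_distance(s, strings))
    best = total_distance(current, strings)
    improved = True
    while improved:
        improved = False
        # Count how often each operation occurs across all alignments.
        freq = Counter()
        for s in strings:
            freq.update(edit_ops(current, s)[1])
        # Goodness index = frequency * cost; test the highest index first.
        ranked = sorted(freq, key=lambda op: freq[op] * OP_COST[op[0]], reverse=True)
        for op in ranked:
            trial = apply_op(current, op)
            score = total_distance(trial, strings)
            if score < best:
                current, best, improved = trial, score, True
                break  # start a new iteration from the improved candidate
    return current, best
```

For the toy set `["aXc", "abY", "Zbc"]`, calling `approximate_median` starts from `"aXc"` (total distance 4), accepts the most frequent operation (substituting `b` at position 1), and returns `("abc", 3)`. Only one optimal alignment per string is traced here, a simplification made for this sketch.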