A Novel Approach for Word Spotting Using Merge-Split Edit Distance

  • Authors:
  • Khurram Khurshid;Claudie Faure;Nicole Vincent

  • Affiliations:
  • Laboratoire CRIP5 --- SIP, Université Paris Descartes, Paris, France 75006;UMR CNRS 5141 - GET ENST, Paris Cedex 13, France 75634;Laboratoire CRIP5 --- SIP, Université Paris Descartes, Paris, France 75006

  • Venue:
  • CAIP '09 Proceedings of the 13th International Conference on Computer Analysis of Images and Patterns
  • Year:
  • 2009

Abstract

Edit distance matching has been used in the literature for word spotting, with characters taken as primitives. The recognition rate, however, is limited by inconsistencies in character segmentation (broken or merged characters) caused by noisy images or distorted characters. In this paper, we propose a Merge-Split Edit distance that overcomes these segmentation problems by incorporating a multi-purpose merge cost function. The system extracts the words and characters in the text and then describes each character with a set of features. Characters are matched by comparing their extracted feature sets using Dynamic Time Warping (DTW), while words are matched by comparing their strings of characters using the proposed Merge-Split Edit distance algorithm. Evaluation of the method on 19th-century historical document images shows extremely promising results.
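
The two matching levels described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the feature set nor the merge cost function, so the sketch assumes each character is a 1-D sequence of feature values (for example, one value per pixel column), uses a length-normalised DTW cost as the substitution cost, models a merge or split by concatenating the feature sequences of two adjacent characters, and charges a constant `gap` for insertions and deletions. The names `dtw`, `merge_split_distance`, and `gap` are hypothetical.

```python
from typing import List, Sequence

# Assumption: a character is a non-empty 1-D sequence of feature values.
Char = Sequence[float]


def dtw(a: Char, b: Char) -> float:
    """Classic DTW cost between two feature sequences, normalised by length."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m] / max(n + m, 1)


def merge_split_distance(x: List[Char], y: List[Char], gap: float = 1.0) -> float:
    """Edit distance over character strings with extra merge/split moves.

    Besides insert, delete, and substitute, two adjacent characters of one
    word may be concatenated and matched against one character of the other.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i >= 1:  # delete x[i-1]
                best = min(best, D[i - 1][j] + gap)
            if j >= 1:  # insert y[j-1]
                best = min(best, D[i][j - 1] + gap)
            if i >= 1 and j >= 1:  # substitute, cost given by DTW
                best = min(best, D[i - 1][j - 1] + dtw(x[i - 1], y[j - 1]))
            if i >= 2 and j >= 1:  # two characters of x match one of y
                merged = list(x[i - 2]) + list(x[i - 1])
                best = min(best, D[i - 2][j - 1] + dtw(merged, y[j - 1]))
            if i >= 1 and j >= 2:  # one character of x matches two of y
                merged = list(y[j - 2]) + list(y[j - 1])
                best = min(best, D[i - 1][j - 2] + dtw(x[i - 1], merged))
            D[i][j] = best
    return D[n][m]


# Toy data: the second word contains the same character broken into two pieces.
intact = [[1.0, 2.0, 3.0, 4.0]]
broken = [[1.0, 2.0], [3.0, 4.0]]
distance = merge_split_distance(intact, broken)
```

In this toy case the broken character is matched through the merge move, so `distance` is zero, whereas a plain edit distance would have to pay for a poor substitution plus one insertion. A real system would replace the scalar features and the constant `gap` with whatever features and cost function the method actually defines.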