Statistical script independent word spotting in offline handwritten documents
Pattern Recognition
This paper presents a dataset of 26,720 handwritten legal amount words written in Hindi and Marathi (Devanagari script), along with a training-free technique for recognizing such handwritten legal amounts on Indian bank cheques. Recognizing handwritten legal amount words in Hindi and Marathi is challenging because many words in the lexicon are similar in size and shape, and many share the same suffixes or prefixes. The proposed recognition technique combines two approaches. The first is based on gradient, structural and cavity (GSC) features together with a binary vector matching (BVM) technique. The second is based on the vertical projection profile (VPP) feature and dynamic time warping (DTW). In the combined approach, a number of the most highly matched words from both approaches are considered for the recognition step, based on a ranking scheme. Syntactical knowledge of the languages is also used to achieve higher reliability. To the best of our knowledge, this is the first work on recognizing handwritten legal amounts written in Hindi and Marathi. Researchers interested in the dataset can contact the authors to obtain it through a shared link.
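The second approach can be illustrated with a minimal sketch, assuming a word image is already binarized and given as a list of rows of 0/1 pixels. The function names (`vertical_projection_profile`, `dtw_distance`, `rank_templates`), the absolute-difference local cost, and the unconstrained warping path are assumptions made for illustration only; the abstract does not specify the preprocessing, normalization, or DTW variant actually used.

```python
def vertical_projection_profile(image):
    """Count foreground (1) pixels in each column of a binary image.

    `image` is a list of equal-length rows of 0/1 values (an assumed input format).
    """
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences with |a_i - b_j| as local cost.

    No window constraint or length normalization is applied (an assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned again with an earlier b
                cost[i][j - 1],      # b[j-1] aligned again with an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def rank_templates(query_image, templates):
    """Rank hypothetical lexicon templates by ascending DTW distance to the query.

    `templates` maps a word label to a binary template image.
    """
    query_profile = vertical_projection_profile(query_image)
    scored = [
        (label, dtw_distance(query_profile, vertical_projection_profile(img)))
        for label, img in templates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])
```

Because DTW allows columns to repeat along the alignment, a horizontally stretched version of a word can still match its template closely, which is the usual motivation for pairing it with a column-wise profile. A ranked list like the one returned by `rank_templates` is the kind of output that a ranking scheme could merge with the candidates from the GSC and BVM approach.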