Script Independent Word Spotting in Offline Handwritten Documents Based on Hidden Markov Models

  • Authors:
  • Safwan Wshah; Gaurav Kumar; Venu Govindaraju

  • Venue:
  • ICFHR '12 Proceedings of the 2012 International Conference on Frontiers in Handwriting Recognition
  • Year:
  • 2012

Abstract

Keyword spotting aims to retrieve all instances of a given keyword from a document in any language. In this paper, we propose a novel script-independent, line-based word spotting framework for offline handwritten documents based on Hidden Markov Models. The methodology represents each keyword in model space as a sequence of character models and uses filler models to better represent background, or non-keyword, text. We propose a two-stage spotting framework in which candidate keywords are further pruned using a character-based background model and a lexicon-based background model. The system handles large vocabularies without the need for word or character segmentation. The system has been evaluated on public datasets in several languages, including IAM for English, AMA for Arabic and LAW for Devanagari. The system outperforms the modern line-based approach on the English, Arabic and Devanagari datasets.
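
The idea of building a keyword model from character models and scoring it against a filler model can be illustrated with a small sketch. This is not the authors' implementation: the discrete toy symbols, the two-state character models, the fixed transition probabilities, the free character loop used as filler, and all function names (`char_model`, `keyword_model`, `viterbi_left_to_right`, `filler_loglik`, `spot_score`) are assumptions made purely for illustration. Only a first-stage keyword-versus-filler score is shown, and the second pruning stage is not modelled.

```python
import math

NEG_INF = float("-inf")

def char_model(char, alphabet, p_match=0.8, n_states=2):
    """Toy character HMM (assumption): each state prefers the symbol named after the character."""
    p_other = (1.0 - p_match) / (len(alphabet) - 1)
    emis = {s: math.log(p_match if s == char else p_other) for s in alphabet}
    return [emis] * n_states

def keyword_model(word, char_models):
    """Concatenate character models into one left-to-right keyword model."""
    states = []
    for ch in word:
        states.extend(char_models[ch])
    return states

def viterbi_left_to_right(obs, states, stay_logp, move_logp):
    """Best-path log-likelihood; the path starts in the first state and ends in the last."""
    n = len(states)
    score = [NEG_INF] * n
    score[0] = states[0][obs[0]]
    for sym in obs[1:]:
        new = [NEG_INF] * n
        for j in range(n):
            stay = score[j] + stay_logp
            move = score[j - 1] + move_logp if j > 0 else NEG_INF
            new[j] = max(stay, move) + states[j][sym]
        score = new
    return score[-1]

def filler_loglik(obs, char_models, trans_logp):
    """Toy filler (assumption): a free loop over all character states, best state per frame."""
    total = 0.0
    for sym in obs:
        best = max(st[sym] for model in char_models.values() for st in model)
        total += trans_logp + best
    return total

def spot_score(obs, word, char_models):
    """Length-normalised log-likelihood ratio of keyword model versus filler model."""
    log_half = math.log(0.5)  # hypothetical fixed transition log-probability
    kw = viterbi_left_to_right(obs, keyword_model(word, char_models), log_half, log_half)
    bg = filler_loglik(obs, char_models, log_half)
    return (kw - bg) / len(obs)

# Toy usage: observations are non-empty strings of quantised feature symbols.
alphabet = "acdot"
models = {ch: char_model(ch, alphabet) for ch in alphabet}
threshold = -0.5  # hypothetical decision threshold
for frames in ("ccaatt", "ddoott"):
    s = spot_score(frames, "cat", models)
    print(frames, "keyword 'cat' spotted:", s > threshold)
```

In this sketch a candidate region is accepted when the normalised ratio exceeds the threshold. Because the filler loop can pick the best-matching character state at every frame, regions that do not match the keyword score well below regions that do.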