Keyword Spotting in Offline Chinese Handwritten Documents Using a Statistical Model

  • Authors:
  • Liang Huang; Fei Yin; Qing-Hu Chen; Cheng-Lin Liu

  • Venue:
  • ICDAR '11 Proceedings of the 2011 International Conference on Document Analysis and Recognition
  • Year:
  • 2011

Abstract

This paper proposes a method for keyword spotting in offline Chinese handwritten documents using a statistical model. Given a text query word, the method measures the similarity between the query word and every candidate word in the document by combining a character classifier with four classifiers that characterize the geometric context. Text lines are over-segmented into primitive segments, candidate characters and words are generated by concatenating consecutive segments, and a beam search strategy is used to search over the candidate words. The character classifier and the combining weights of the model are trained by optimizing a one-vs-all discrimination objective so as to maximize the similarity of true words and minimize the similarity of imposters. In experiments on a test dataset containing 1,015 pages written by 180 writers, the proposed method yields promising performance. For retrieving four-character words, the recall, precision and F-measure are 92.47%, 83.76% and 87.90%, respectively.
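
The candidate generation and beam search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `spot_word`, `score`, `max_merge` and `beam_width` are hypothetical, and `score(start, end, ch)` is only a stand-in for the combined similarity (character classifier plus geometric context terms) of matching the segments `start..end` against the query character `ch`. The abstract does not give the actual scoring function, merge limit or beam width, so all of these are assumptions.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open range of primitive segment indices


def spot_word(
    n_segments: int,
    query: str,
    score: Callable[[int, int, str], float],
    max_merge: int = 4,   # assumed cap on segments merged into one character
    beam_width: int = 5,  # assumed number of hypotheses kept per step
) -> List[Tuple[float, List[Span]]]:
    """Return candidate words as (total score, character spans), best first."""
    results: List[Tuple[float, List[Span]]] = []
    for start in range(n_segments):
        # Each hypothesis: (accumulated score, next segment index, spans so far).
        beam: List[Tuple[float, int, List[Span]]] = [(0.0, start, [])]
        for ch in query:
            expanded = []
            for total, pos, spans in beam:
                # A candidate character is 1..max_merge consecutive segments.
                for k in range(1, max_merge + 1):
                    end = pos + k
                    if end > n_segments:
                        break
                    expanded.append(
                        (total + score(pos, end, ch), end, spans + [(pos, end)])
                    )
            # Keep only the best-scoring partial matches.
            expanded.sort(key=lambda h: h[0], reverse=True)
            beam = expanded[:beam_width]
            if not beam:
                break  # ran off the end of the text line
        else:
            # Every query character was matched: record the surviving hypotheses.
            results.extend((total, spans) for total, _, spans in beam)
    results.sort(key=lambda r: r[0], reverse=True)
    return results
```

In a real system the returned scores would then be thresholded to decide which candidate words count as hits; the threshold, like the rest of this sketch, is left open here.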