Keyword spotting in unconstrained handwritten Chinese documents using contextual word model
Image and Vision Computing
This paper proposes a method for keyword spotting in offline Chinese handwritten documents using a statistical model. Given a text query word, the method measures the similarity between the query word and every candidate word in the document by combining a character classifier with four classifiers that characterize the geometric contexts. Text lines are over-segmented into primitive segments, candidate characters and words are generated by concatenating consecutive segments, and a beam search strategy is used to search the candidate words. The character classifier and the model combining weights are trained by optimizing a one-vs-all discrimination objective so as to maximize the similarity of true words and minimize the similarity of impostors. In experiments on a test dataset containing 1,015 pages written by 180 writers, the proposed method yields promising performance. For retrieving four-character words, the recall, precision and F-measure are 92.47%, 83.76% and 87.90%, respectively.
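The candidate generation, score combination and beam search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `max_merge` and `beam_width` parameters, the plain weighted sum used to combine the five classifier outputs, and the toy scoring function in the usage note are all assumptions made for illustration. The last helper only checks that the reported F-measure agrees with the reported precision and recall, assuming the usual harmonic-mean definition.

```python
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]  # half-open range [start, end) of primitive segments


def candidate_spans(num_segments: int, max_merge: int) -> List[Span]:
    """Candidate characters: every run of 1..max_merge consecutive segments."""
    return [(s, s + k)
            for s in range(num_segments)
            for k in range(1, max_merge + 1)
            if s + k <= num_segments]


def combined_score(outputs: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed combination rule: weighted sum of the character classifier
    output and the four geometric-context outputs (five values in total)."""
    return sum(w * o for w, o in zip(weights, outputs))


def spot_word(num_segments: int,
              query: str,
              score_fn: Callable[[str, int, int], float],
              max_merge: int = 3,
              beam_width: int = 5) -> List[Tuple[float, List[Span]]]:
    """Beam search for `query` over contiguous candidate characters.

    `score_fn(ch, start, end)` is a hypothetical callback returning the
    similarity between query character `ch` and segments [start, end).
    Returns up to `beam_width` hypotheses as (total score, spans).
    """
    beam: List[Tuple[float, List[Span]]] = [(0.0, [])]
    for ch in query:
        expanded = []
        for score, spans in beam:
            # The first character may start anywhere; later characters
            # must start where the previous character ended.
            starts = [spans[-1][1]] if spans else range(num_segments)
            for s in starts:
                for k in range(1, max_merge + 1):
                    e = s + k
                    if e > num_segments:
                        break
                    expanded.append((score + score_fn(ch, s, e),
                                     spans + [(s, e)]))
        # Keep only the best partial hypotheses.
        expanded.sort(key=lambda h: h[0], reverse=True)
        beam = expanded[:beam_width]
    return beam


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

As a usage example under the same assumptions, take a text line with four primitive segments in which the character "A" covers segment 1 and "B" covers segments 2 and 3. With a toy `score_fn` that returns 1.0 for exactly those spans and 0.0 otherwise, `spot_word(4, "AB", score_fn)` ranks the hypothesis `[(1, 2), (2, 4)]` first with a total score of 2.0. In a real system the callback would wrap trained classifiers, and a threshold on the final score would decide whether a candidate word is reported as a match.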