Word Spotting in Chinese Document Images without Layout Analysis

  • Authors:
  • Yue Lu;Chew Lim Tan

  • Affiliations:
  • -;-

  • Venue:
  • ICPR '02: Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02), Volume 3
  • Year:
  • 2002

Abstract

This paper proposes an approach to searching for user-specified words or phrases in Chinese document images without requiring layout analysis. Bounding boxes of Chinese character images are first determined using connected component analysis. Next, a suitable character from the user-specified word/phrase is chosen as the initial character to search for a matching candidate in the document. Once a matching candidate is found, its adjacent characters in the horizontal and vertical directions are examined for matches with the corresponding remaining characters of the word/phrase, subject to constraints on positional relation and size similarity. Character matching is done in two stages: coarse matching based on stroke density features, followed by a second matching phase for which a weighted Hausdorff distance (WHD) is proposed. Experimental results show that the proposed method can effectively find the user-specified Chinese word/phrase in horizontal or vertical text lines of document images.
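
The two-stage character matching described in the abstract can be illustrated with a small sketch. The abstract does not define the stroke density features or the WHD weighting, so everything below is an assumption made for illustration: stroke density is interpreted as the fraction of foreground pixels in each cell of a grid, the WHD is written as a weighted mean of nearest-neighbour distances (uniform weights by default), and the function names, grid size, and thresholds are hypothetical. Both images are assumed to be binary and already normalized to the same size.

```python
from math import hypot


def stroke_density(img, grid=2):
    """Foreground-pixel fraction per cell of a grid x grid partition.

    Assumed stand-in for the paper's stroke density features.
    """
    h, w = len(img), len(img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            area = max(1, (y1 - y0) * (x1 - x0))
            ink = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            feats.append(ink / area)
    return feats


def coarse_match(a, b, grid=2, tol=0.25):
    """Stage 1: accept only if every cell density differs by at most tol."""
    fa, fb = stroke_density(a, grid), stroke_density(b, grid)
    return max(abs(p - q) for p, q in zip(fa, fb)) <= tol


def foreground_points(img):
    """Coordinates (row, col) of all foreground pixels."""
    return [(y, x) for y, row in enumerate(img) for x, v in enumerate(row) if v]


def directed_whd(pa, pb, weight=None):
    """Weighted mean distance from each point in pa to its nearest point in pb.

    weight is a hypothetical per-point weighting function returning a
    positive number; None means uniform weights.
    """
    if not pa or not pb:
        return float("inf")
    total = wsum = 0.0
    for p in pa:
        w = 1.0 if weight is None else weight(p)
        d = min(hypot(p[0] - q[0], p[1] - q[1]) for q in pb)
        total += w * d
        wsum += w
    return total / wsum


def whd(a, b, weight=None):
    """Stage 2 distance: the larger of the two directed distances."""
    pa, pb = foreground_points(a), foreground_points(b)
    return max(directed_whd(pa, pb, weight), directed_whd(pb, pa, weight))


def is_match(a, b, tol=0.25, max_dist=1.0):
    """Run the cheap coarse test first; compute the WHD only for survivors."""
    return coarse_match(a, b, tol=tol) and whd(a, b) <= max_dist


if __name__ == "__main__":
    query = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    other = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
    print(is_match(query, query), is_match(query, other))
```

Ordering the stages this way means the nearest-neighbour search, which is the more expensive computation in this sketch, runs only on candidates that already pass the density comparison.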