Eigenspace Method for Text Retrieval in Historical Document Images

  • Authors:
  • Kengo Terasawa; Takeshi Nagasaki; Toshio Kawashima

  • Affiliations:
  • Future University-Hakodate, Japan; Future University-Hakodate, Japan; Future University-Hakodate, Japan

  • Venue:
  • ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
  • Year:
  • 2005

Abstract

This paper describes a text retrieval method that does not require character segmentation. Segmenting historical document images into individual characters is difficult, so conventional OCR, which depends on segmentation, performs poorly on them. Our method instead divides the text image into a sequence of small slits and retrieves the image region corresponding to a query image by matching these slit sequences. Applying the eigenspace method to the slit images makes this matching efficient, and dynamic time warping (DTW) further improves the results. The method is more accurate than simple template matching and has a far lower computational cost.
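
The pipeline outlined in the abstract (slits, eigenspace projection, DTW matching) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no parameters, so the slit width, step, number of eigenvectors, Euclidean local cost, and sliding-window search below are all assumptions, and the function names (`slit_sequence`, `fit_eigenspace`, `project`, `dtw_distance`, `retrieve`) are hypothetical. The eigenspace is computed here as a plain PCA via SVD.

```python
import numpy as np

def slit_sequence(image, slit_width=4, step=1):
    """Cut a text-line image (H x W) into overlapping vertical slits.

    Each slit is flattened into one vector. slit_width and step are
    illustrative values, not taken from the paper.
    """
    _, w = image.shape
    starts = range(0, w - slit_width + 1, step)
    return np.array([image[:, s:s + slit_width].ravel() for s in starts],
                    dtype=float)

def fit_eigenspace(slits, n_components=8):
    """PCA via SVD: return the mean slit and the top principal axes."""
    mean = slits.mean(axis=0)
    _, _, vt = np.linalg.svd(slits - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(slits, mean, axes):
    """Project slit vectors into the low-dimensional eigenspace."""
    return (slits - mean) @ axes.T

def dtw_distance(a, b):
    """Classic DTW between two feature sequences (Euclidean local cost)."""
    n, m = len(a), len(b)
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

def retrieve(query_feats, doc_feats):
    """Slide a query-length window over the document features.

    Returns (best_start_slit, best_dtw_distance). A fixed-length window
    is a simplification chosen for brevity.
    """
    n = len(query_feats)
    best_start, best_dist = -1, np.inf
    for s in range(len(doc_feats) - n + 1):
        d = dtw_distance(query_feats, doc_feats[s:s + n])
        if d < best_dist:
            best_start, best_dist = s, d
    return best_start, best_dist

# Usage on synthetic data (a random array stands in for a scanned text line).
rng = np.random.default_rng(0)
doc_image = rng.random((16, 120))
query_image = doc_image[:, 40:60].copy()   # query cropped from the document

doc_slits = slit_sequence(doc_image)
mean, axes = fit_eigenspace(doc_slits)
doc_feats = project(doc_slits, mean, axes)
query_feats = project(slit_sequence(query_image), mean, axes)

start, dist = retrieve(query_feats, doc_feats)
print("best start slit:", start, "DTW distance:", dist)
```

Projecting each slit onto a few principal axes is what keeps the matching cheap in this sketch: the DTW local cost is computed on short feature vectors rather than on raw pixel columns.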