Character Extraction from Noisy Background for an Automatic Reference System

  • Authors:
  • Hideyuki Negishi;Jien Kato;Hiroyuki Hase;Toyohide Watanabe

  • Affiliations:
  • -;-;-;-

  • Venue:
  • ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
  • Year:
  • 1999

Abstract

It is important to provide digitized manuscripts of old literature (in page-image form) together with their electronic texts (in full-text form) on the Internet, with a mechanism that automatically cross-references the images and the texts. As an essential step toward such an automatic reference system, this paper addresses the problem of extracting character areas from page images of old handwritten manuscripts. Page images of old manuscripts are usually very dirty and considerably large. To overcome the first problem, we propose a new, effective method for separating characters from a noisy background, since conventional threshold selection techniques cannot cope with images in which the gray levels of the character parts overlap those of the background. To solve the second problem, we propose an approach based on a downscaled image and a recursive labeling method for word extraction. This approach is suitable for large images because it saves memory and reduces processing time.
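
The abstract does not give the details of the downscaling or labeling steps, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a block-minimum downscale, a fixed gray-level threshold (used here purely for illustration, since the paper argues that conventional thresholding is inadequate for noisy pages), and 4-connected region labeling with an explicit stack in place of recursion. The names `downscale`, `label_regions`, and `bounding_boxes` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # gray levels: low = dark ink, high = bright paper


def downscale(img: Image, factor: int) -> Image:
    """Shrink the image: each factor x factor block becomes its darkest pixel."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [img[yy][xx]
                     for yy in range(y, min(y + factor, h))
                     for xx in range(x, min(x + factor, w))]
            row.append(min(block))  # keep thin strokes visible after shrinking
        out.append(row)
    return out


def label_regions(img: Image, threshold: int) -> Tuple[Image, int]:
    """Label 4-connected regions of pixels darker than `threshold`.

    Assumption: a fixed threshold stands in for the paper's separation step.
    Returns the label map (0 = background) and the number of regions.
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] >= threshold or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            stack = [(y, x)]  # explicit stack avoids deep recursion
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx]
                            and img[ny][nx] < threshold):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


def bounding_boxes(labels: Image, count: int, factor: int):
    """Boxes as (top, left, bottom, right) in full-resolution coordinates.

    Bottom and right are exclusive and may overshoot the original image
    when its size is not a multiple of `factor`.
    """
    boxes = [None] * count
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab == 0:
                continue
            b = boxes[lab - 1]
            if b is None:
                boxes[lab - 1] = [y, x, y + 1, x + 1]
            else:
                b[0], b[1] = min(b[0], y), min(b[1], x)
                b[2], b[3] = max(b[2], y + 1), max(b[3], x + 1)
    return [tuple(v * factor for v in b) for b in boxes]


# Usage: two dark strokes on a bright 4 x 8 page.
page = [
    [255, 255, 255, 255, 255, 255, 255, 255],
    [255,  10, 255, 255, 255, 255,  20, 255],
    [255,  10, 255, 255, 255, 255,  20, 255],
    [255, 255, 255, 255, 255, 255, 255, 255],
]
small = downscale(page, 2)
labels, count = label_regions(small, threshold=128)
boxes = bounding_boxes(labels, count, 2)
```

Labeling runs on the downscaled image, so the label map needs roughly 1/factor² of the memory of a full-resolution one. This is the kind of saving the abstract attributes to working on a downscaled image.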