Restoring Chinese documents images based on text boundary lines

  • Authors:
  • Hong Liu;Runwei Ding

  • Affiliations:
  • Key Laboratory of Machine Perception and Intelligence, Key Laboratory of Integrated Microsystem, Shenzhen Graduate School, Peking University, China;Key Laboratory of Integrated Microsystem, Key Laboratory of Machine Perception and Intelligence, Shenzhen Graduate School, Peking University, China

  • Venue:
  • SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
  • Year:
  • 2009

Abstract

Distortion often appears in document images when thick bound volumes are scanned. Scanned grayscale images suffer from two kinds of distortion: a shadow appears at the spine area of the volume, and the words within the shadow are warped. In this paper, a novel method based on text boundary lines is presented for the efficient restoration of warped scanned Chinese document images. We first detect on which side of an image the shadow lies using a row grayscale analysis method. The shadow is then removed by a modified Niblack's algorithm. To detect the warping, a method for detecting text boundary lines is proposed. Finally, an adjustment method based on the text boundary lines is carried out to restore the warped words. Experiments are conducted on 400 varied scanned Chinese document images. The improvement in average character recall ranges from 11.92% to 14.89%. The experiments show that the proposed restoration method is efficient for Chinese documents with both text and non-text regions.
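The first two steps of the pipeline can be illustrated with a minimal sketch. The abstract does not give the details of the row grayscale analysis or of the modification to Niblack's algorithm, so the code below is not the authors' method: `shadow_side` is a hypothetical stand-in that simply compares the brightness of the left and right halves, and `niblack_threshold` implements the standard (unmodified) Niblack rule T = mean + k * std over a local window. The function names, the window size, and k = -0.2 are assumptions chosen for illustration.

```python
from math import sqrt


def shadow_side(image):
    """Guess which side holds the spine shadow (hypothetical stand-in).

    image is a list of rows of gray levels (0 = black, 255 = white).
    The darker half is assumed to contain the shadow.
    """
    width = len(image[0])
    half = width // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[width - half:]) for row in image)
    return "left" if left < right else "right"


def niblack_threshold(image, window=3, k=-0.2):
    """Binarize with the standard Niblack rule (not the paper's modified one).

    For each pixel, T = mean + k * std over a window x window neighborhood
    that is clipped at the image borders. Returns 1 for background
    (pixel >= T) and 0 for ink.
    """
    height, width = len(image), len(image[0])
    r = window // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            vals = [
                image[j][i]
                for j in range(max(0, y - r), min(height, y + r + 1))
                for i in range(max(0, x - r), min(width, x + r + 1))
            ]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            threshold = mean + k * sqrt(var)
            out[y][x] = 1 if image[y][x] >= threshold else 0
    return out


if __name__ == "__main__":
    page = [[40, 60, 200, 220], [50, 70, 210, 230]]
    print(shadow_side(page))
    print(niblack_threshold([[100, 100, 100], [100, 10, 100], [100, 100, 100]]))
```

Because the threshold adapts to the local mean and deviation, a dark stroke can still be separated from a darkened background inside the shadow, which is the usual reason for preferring a local rule such as Niblack's over a single global threshold.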