Recovery of Distorted Document Images from Bound Volumes

  • Authors:
  • Affiliations:
  • Venue:
  • ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

Abstract: Recovery of document images scanned from thick bound volumes is necessary for the purpose of human reading and text retrieval. The main problem with scanning of bound volumes is that there always occurs perspective distortion. Such distortion causes two sources of degradation for the scanned images - 1) shadow at the book spine area, and 2) warping of the words in the shadow. In this paper, we have developed a restoration system to solve these two problems. First, the boundary between the shadow and the clean area is detected. Then the system applies a modified Niblack's method to remove the shadow. The system uses a connected component analysis to help improve the noise reduction and adjust the location and orientation of the warped word in the shadow area, i.e. the words within the boundary detected earlier. The implementation results for each step are presented. Our system will be used in the text retrieval projects for National Archives of Singapore and NUS Digital Library.