Robust text and drawing segmentation algorithm for historical documents

  • Authors:
  • Rafi Cohen;Abedelkadir Asi;Klara Kedem;Jihad El-Sana;Itshak Dinstein

  • Affiliations:
  • Ben-Gurion University of the Negev;Ben-Gurion University of the Negev;Ben-Gurion University of the Negev;Ben-Gurion University of the Negev;Ben-Gurion University of the Negev

  • Venue:
  • Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing
  • Year:
  • 2013

Abstract

We present a method for segmenting historical document images into regions of different content. First, we separate text elements from non-text elements using a binarized version of the document. Then, we refine the segmentation of the non-text regions into drawings, background, and noise. At this stage, spatial and color features are exploited to ensure coherent regions in the final segmentation. Experiments show that the suggested approach achieves better segmentation quality than other methods. We evaluate segmentation quality on 252 pages of a historical manuscript, on which the suggested method achieves about 92% and 90% segmentation accuracy for drawings and text elements, respectively.
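The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the features or thresholds, so the sketch assumes a toy rule in which connected components of a binarized page are first split into text and non-text by bounding-box height, and non-text components are then split into drawings and noise by area. Color features, which the paper uses in the refinement stage, are omitted. All function names and threshold values (`is_text`, `refine_non_text`, `min_text_height`, `max_text_height`, `noise_max_area`) are hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Return the 4-connected foreground (value 1) components as pixel lists."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def is_text(pixels, min_text_height=2, max_text_height=3):
    """Stage 1 (assumed rule): text components have a glyph-like height."""
    ys = [y for y, _ in pixels]
    height = max(ys) - min(ys) + 1
    return min_text_height <= height <= max_text_height


def refine_non_text(pixels, noise_max_area=1):
    """Stage 2 (assumed rule): tiny non-text components are noise, the rest drawings."""
    return "noise" if len(pixels) <= noise_max_area else "drawing"


def segment(binary):
    """Label every pixel as background, text, drawing, or noise."""
    labels = [["background"] * len(binary[0]) for _ in binary]
    for pixels in connected_components(binary):
        label = "text" if is_text(pixels) else refine_non_text(pixels)
        for y, x in pixels:
            labels[y][x] = label
    return labels


# Toy binarized page: a tall stroke, an isolated speck, and a small glyph.
page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
]
labels = segment(page)
```

On this toy page, the tall stroke on the left is labeled as a drawing, the isolated pixel as noise, and the 2x2 block as text. A realistic pipeline would replace both threshold rules with learned or tuned criteria over spatial and color features.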