Overlapped text segmentation using Markov random field and aggregation

  • Authors:
  • Xujun Peng;Srirangaraj Setlur;Venu Govindaraju;Ramachandrula Sitaram

  • Affiliations:
  • University at Buffalo, SUNY, Amherst, NY;University at Buffalo, SUNY, Amherst, NY;University at Buffalo, SUNY, Amherst, NY;HP Labs India, Bangalore, India

  • Venue:
  • DAS '10 Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
  • Year:
  • 2010

Abstract

Separating machine-printed text from handwriting in overlapping text is a challenging problem in document analysis, and no reliable algorithms have been developed thus far. In this paper, we propose a novel approach for separating handwriting from a binary image of overlapped text. Instead of using fixed-size training patches, we describe an aggregation method that uses shape context features to extract training samples automatically. We use a Markov Random Field (MRF) to model the overlapped text. The neighborhood system is inherited from a coarsening procedure, and the prior and likelihood of the MRF are learned based on a distance metric. Experimental results show that the proposed method achieves 87.97% recall for handwriting and 91.44% recall for machine-printed text.
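
To make the MRF formulation concrete, the following is a minimal sketch of two-label MRF inference over a small graph of text segments. It is not the authors' implementation. The abstract does not name an inference algorithm, so iterated conditional modes (ICM) is used here as a stand-in. The per-node costs are hand-set rather than learned from a distance metric, the edges are written out by hand rather than inherited from a coarsening procedure, and a simple Potts smoothness term stands in for the learned prior. The function name `icm_label` and all parameter names are hypothetical.

```python
def icm_label(unary, edges, pairwise_weight=1.0, max_iters=10):
    """Label nodes of an MRF with iterated conditional modes (ICM).

    unary[i][k] is the (assumed, hand-set) cost of giving node i label k.
    edges is a list of undirected (i, j) pairs defining the neighborhood.
    A Potts term adds pairwise_weight for every neighbor with a different label.
    """
    n = len(unary)
    num_labels = len(unary[0])

    # Build adjacency lists from the edge list.
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Start from the label with the lowest unary cost at each node.
    labels = [min(range(num_labels), key=lambda k: unary[i][k]) for i in range(n)]

    for _ in range(max_iters):
        changed = False
        for i in range(n):
            def local_energy(k):
                disagreements = sum(1 for j in neighbors[i] if labels[j] != k)
                return unary[i][k] + pairwise_weight * disagreements

            best = min(range(num_labels), key=local_energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # Reached a local minimum of the energy.
    return labels


# Label 0 = machine-printed, label 1 = handwriting (illustrative convention).
# Three segments in a chain; the middle one weakly prefers "handwriting".
unary = [[0.1, 2.0], [1.1, 1.0], [0.1, 2.0]]
edges = [(0, 1), (1, 2)]

smoothed = icm_label(unary, edges)
independent = icm_label(unary, edges, pairwise_weight=0.0)
print(smoothed, independent)
```

With the smoothness term enabled, the weakly supported middle segment is relabeled to agree with its neighbors. With `pairwise_weight=0.0`, each segment is labeled independently from its own cost. ICM finds only a local minimum of the energy, so a different inference method could give a different labeling on larger graphs.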