Markov Random Field Based Text Identification from Annotated Machine Printed Documents

  • Authors:
  • Xujun Peng;Srirangaraj Setlur;Venu Govindaraju;Ramachandrula Sitaram;Kiran Bhuvanagiri

  • Venue:
  • ICDAR '09 Proceedings of the 2009 10th International Conference on Document Analysis and Recognition
  • Year:
  • 2009

Abstract

In this paper, we describe an approach for segmenting handwritten text, machine printed text, and noise in annotated machine printed documents. Three categories of word-level features are extracted. We use a modified K-Means clustering algorithm for classification, followed by a relabeling procedure using a Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
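
The relabeling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the modified K-Means, the patch neighborhood, or the MRF potentials, so the sketch assumes a generic Potts smoothness cost and min-sum loopy belief propagation. The function name `relabel_with_bp`, the `smooth` and `iters` parameters, and the toy costs are all hypothetical. The per-label costs in `unary` stand in for whatever an initial classifier produces, for example distances to cluster centroids.

```python
def relabel_with_bp(unary, edges, smooth=1.0, iters=10):
    """Relabel nodes with min-sum loopy BP under an assumed Potts model.

    unary[i][l] is the cost of giving node i label l (lower is better).
    edges is a list of undirected (i, j) neighbour pairs.
    smooth is the assumed penalty when two neighbours take different labels.
    """
    n_labels = len(unary[0])
    # Build neighbour lists from the undirected edge list.
    nbrs = {i: [] for i in range(len(unary))}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # msgs[(i, j)][l]: cost message from node i to node j about label l.
    msgs = {(i, j): [0.0] * n_labels for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            # Unary cost plus all incoming messages except the one from j.
            h = [unary[i][l] + sum(msgs[(k, i)][l] for k in nbrs[i] if k != j)
                 for l in range(n_labels)]
            best = min(h)
            # Potts shortcut: keep the same label, or switch and pay `smooth`.
            m = [min(h[l], best + smooth) for l in range(n_labels)]
            low = min(m)
            new[(i, j)] = [v - low for v in m]  # normalise to keep values small
        msgs = new
    labels = []
    for i in range(len(unary)):
        belief = [unary[i][l] + sum(msgs[(k, i)][l] for k in nbrs[i])
                  for l in range(n_labels)]
        labels.append(belief.index(min(belief)))
    return labels


# Toy chain of five word patches; label 0 = printed, 1 = handwritten (assumed).
# The middle patch weakly prefers label 1, but its neighbours outvote it.
unary = [[0, 2], [0, 2], [0.6, 0.4], [0, 2], [0, 2]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(relabel_with_bp(unary, edges))  # [0, 0, 0, 0, 0]
```

In this toy case the isolated label is flipped because disagreeing with both neighbours costs more than the small unary preference gains. A larger unary margin or a smaller `smooth` would leave the label unchanged.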