Markov Random Field Based Text Identification from Annotated Machine Printed Documents

Authors:
Xujun Peng; Srirangaraj Setlur; Venu Govindaraju; Ramachandrula Sitaram; Kiran Bhuvanagiri
Venue:
ICDAR '09 Proceedings of the 2009 10th International Conference on Document Analysis and Recognition
Year:
2009

Abstract

In this paper, we describe an approach to segment handwritten text, machine-printed text, and noise from annotated machine-printed documents. Three categories of word-level features are extracted. We use a modified K-means clustering algorithm for classification, followed by a relabeling procedure that uses a Markov Random Field (MRF) based on the concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
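
The relabeling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the K-means modification, or the neighborhood definition. The sketch assumes that each patch already has a per-label cost from the initial classification stage (for example, a distance to each cluster centroid), that neighboring patches form a simple chain, and that disagreement between neighbors is penalized by a Potts cost. The function name `relabel_with_bp`, the label order, and all numeric values are hypothetical.

```python
def relabel_with_bp(unary, neighbors, smooth=1.0, iterations=10):
    """Min-sum loopy belief propagation with a Potts pairwise cost.

    unary[i][k]  : cost of giving patch i label k (lower is better);
                   assumed to come from the initial classifier.
    neighbors[i] : indices of patches adjacent to patch i (symmetric).
    smooth       : cost paid when two neighboring patches disagree.
    """
    n_labels = len(unary[0])
    # msgs[(i, j)][k] is the message from patch i to patch j about label k.
    msgs = {(i, j): [0.0] * n_labels
            for i, nbrs in enumerate(neighbors) for j in nbrs}

    for _ in range(iterations):
        new_msgs = {}
        for (i, j) in msgs:
            # Cost at patch i, using messages from every neighbor except j.
            h = [unary[i][k]
                 + sum(msgs[(m, i)][k] for m in neighbors[i] if m != j)
                 for k in range(n_labels)]
            best = min(h)
            # Potts cost: agree for free, or switch labels and pay `smooth`.
            out = [min(h[k], best + smooth) for k in range(n_labels)]
            low = min(out)
            new_msgs[(i, j)] = [v - low for v in out]  # normalize
        msgs = new_msgs

    labels = []
    for i, nbrs in enumerate(neighbors):
        belief = [unary[i][k] + sum(msgs[(m, i)][k] for m in nbrs)
                  for k in range(n_labels)]
        labels.append(belief.index(min(belief)))
    return labels


# Hypothetical label order: 0 = handwritten, 1 = machine printed, 2 = noise.
# Five patches in a row; the middle one is weakly classified as handwritten.
unary = [
    [2.0, 0.0, 2.0],
    [2.0, 0.0, 2.0],
    [0.0, 0.5, 2.0],
    [2.0, 0.0, 2.0],
    [2.0, 0.0, 2.0],
]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]

initial = [costs.index(min(costs)) for costs in unary]
relabeled = relabel_with_bp(unary, neighbors)
print(initial)    # [1, 1, 0, 1, 1]
print(relabeled)  # [1, 1, 1, 1, 1]
```

In this toy case the weakly supported handwritten label in the middle is overruled by its confidently machine-printed neighbors, which is the kind of correction a neighborhood-based relabeling step is meant to provide. The synchronous min-sum update and the chain neighborhood were chosen only to keep the example short.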