Overlapped text segmentation using Markov random field and aggregation
DAS '10 Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
Pixel accurate document image content extraction
Proceedings of the 2011 ACM Symposium on Applied Computing
Using a boosted tree classifier for text segmentation in hand-annotated documents
Pattern Recognition Letters
In this paper, we describe an approach to segmenting handwritten text, machine-printed text, and noise in annotated machine-printed documents. Three categories of word-level features are extracted. We use a modified K-Means clustering algorithm for classification, followed by a relabeling procedure using a Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
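The two-stage idea in the abstract (cluster first, then relabel using neighboring patches) can be illustrated with a minimal sketch. Everything in it is an assumption rather than the paper's method: it uses plain Lloyd's K-Means instead of the modified variant, one-dimensional toy features instead of the three feature categories, a Potts smoothness cost, and min-sum loopy belief propagation as one common form of BP. The names `kmeans`, `relabel_with_bp`, and `smooth` are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iters=20):
    """Plain Lloyd's K-Means (NOT the paper's modified variant).

    Initial centroids are passed in explicitly so the run is deterministic.
    """
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            costs = [sq_dist(p, c) for c in centroids]
            groups[costs.index(min(costs))].append(p)
        for idx, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster goes empty
                centroids[idx] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def relabel_with_bp(unary, neighbors, smooth=0.1, iters=10):
    """Min-sum loopy belief propagation with an assumed Potts pairwise cost.

    unary[i][k]  : cost of giving patch i label k
    neighbors[i] : indices of patches adjacent to patch i (symmetric)
    smooth       : penalty paid when two neighboring patches disagree
    """
    n, k = len(unary), len(unary[0])
    # msg[(i, j)][b] is what patch i tells patch j about j taking label b.
    msg = {(i, j): [0.0] * k for i in range(n) for j in neighbors[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            # Cost at i, using every incoming message except the one from j.
            h = [unary[i][a]
                 + sum(msg[(m, i)][a] for m in neighbors[i] if m != j)
                 for a in range(k)]
            best = min(h)
            # Potts model: either agree with b, or take the best label and pay.
            out = [min(h[b], best + smooth) for b in range(k)]
            base = min(out)
            new[(i, j)] = [v - base for v in out]  # normalize for stability
        msg = new
    labels = []
    for i in range(n):
        belief = [unary[i][a] + sum(msg[(m, i)][a] for m in neighbors[i])
                  for a in range(k)]
        labels.append(belief.index(min(belief)))
    return labels


# Toy data: eight patches in a row, each with one hypothetical feature value.
points = [[0.1], [0.2], [0.55], [0.15], [0.1], [0.9], [1.0], [0.95]]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7], [6]]

centroids = kmeans(points, [[0.0], [1.0]])
unary = [[sq_dist(p, c) for c in centroids] for p in points]
initial = [row.index(min(row)) for row in unary]
relabeled = relabel_with_bp(unary, neighbors, smooth=0.1)
print("K-Means labels:", initial)
print("After relabeling:", relabeled)
```

In this toy run the third patch sits between the two clusters, so its label from clustering alone is weakly supported; the neighborhood term lets its confidently labeled neighbors outvote it. The `smooth` value controls how strongly neighbors influence each other and would need tuning on real data.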