Separating Handwritten Material from Machine Printed Text Using Hidden Markov Models

  • Venue: ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
  • Year: 2001

Abstract

In this paper, we address the problem of separating handwritten annotations from machine-printed text within a document. We present an algorithm based on the theory of hidden Markov models (HMMs) to distinguish between machine-printed and handwritten material. No OCR results are required before or during the process, and classification is performed at the word level. Handwritten annotations are not limited to marginal areas: the approach can handle document images in which handwritten annotations overlay machine-printed text, and it shows promise in our experiments. Experimental results show that the proposed method achieves 72.19% recall for fully extracted handwritten words and 90.37% for partially extracted ones. The precision of extracting handwritten words reaches 92.86%.
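
The abstract does not specify the features, model topology, or parameters, so the following is only a minimal sketch of the general idea of word-level classification with two competing HMMs, not the authors' implementation. It assumes, hypothetically, that each word image has already been converted into a sequence of discrete symbols (0 for a "regular" column feature, 1 for an "irregular" one), and that one HMM has been trained per class. The names `log_likelihood`, `classify`, and `MODELS`, and all probability values, are illustrative assumptions. A word is assigned to the class whose model gives the observation sequence the higher likelihood under the forward algorithm.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    # Initialise forward variables with the first observation symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale factors.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def classify(obs, models):
    """Pick the class whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models: (start, transition, emission) per class.
# Illustrative values only; they are not taken from the paper.
MODELS = {
    "printed": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.9, 0.1], [0.8, 0.2]],  # mostly emits symbol 0 ("regular")
    ),
    "handwritten": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.3, 0.7], [0.4, 0.6]],  # mostly emits symbol 1 ("irregular")
    ),
}

if __name__ == "__main__":
    print(classify([0, 0, 0, 0, 0, 0], MODELS))
    print(classify([1, 1, 1, 1, 1], MODELS))
```

In a real system the symbol sequences would come from a feature extractor applied to segmented word images, and the model parameters would be estimated from labelled training data rather than set by hand.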