Pre-processing Methods for Handwritten Arabic Documents

  • Authors:
  • Faisal Farooq;Venu Govindaraju;Michael Perrone

  • Affiliations:
  • University at Buffalo, Amherst, NY;University at Buffalo, Amherst, NY;IBM TJ Watson Center, Yorktown, NY

  • Venue:
  • ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In order to improve the readability and the automatic recognition of handwritten document images, preprocessing steps are imperative. These steps in addition to conventional steps of noise removal and filtering include text normalization such as baseline correction, slant normalization and skew correction. These steps make the feature extraction process more reliable and effective. Recently Arabic handwriting recognition has received some attention from the research community. Due to the unique nature of the script, the conventional methods do not prove to be effective. In our work, we describe an orientation independent technique for baseline detection of Arabic words. In addition to that we describe, in the rest of the paper, our techniques for slant normalization, slope correction, line and word separation in handwritten Arabic documents. We show how the baseline can be exploited for slope and skew correction before proceeding with the steps of line and word separation.