A fast skew detection and correction algorithm for machine printed words in Gurmukhi script

  • Authors:
  • Dharam Veer Sharma;Gurpreet Singh Lehal

  • Affiliations:
  • Punjabi University, Patiala, Punjab, India;Punjabi University, Patiala, Punjab, India

  • Venue:
  • Proceedings of the International Workshop on Multilingual OCR
  • Year:
  • 2009

Abstract

During scanning, a document image may become skewed because the paper is improperly aligned on the scanner, which misaligns the text in the resulting image. In some cases the image may even have double skew, at both the page level and the word level, due to curl near the binding of a book or in old typed or printed documents. Skew detection and correction is therefore an indispensable pre-processing task before text recognition. In this paper we propose a robust technique for skew detection and correction of isolated words in machine-printed Gurmukhi documents. The method relies on the structural properties of words in Indic scripts. The algorithm first identifies skewed words and then corrects only those words. Under the proposed technique, an isolated word with a straight headline is not considered skewed; when the length of the detected headline is less than a threshold value, the word may be skewed and becomes a target for correction. The algorithm can be equally effective for machine-printed documents in other scripts where a headline connects the characters of a word.
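The headline test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the threshold, the angle estimation, or the correction step, so everything below is an assumption made for illustration. The word image is assumed to be a binary grid (lists of 0/1), the headline length is taken as the longest horizontal run of foreground pixels, the threshold is a hypothetical fraction `ratio` of the word width, and correction is approximated by a vertical shear searched over a small range of candidate angles. All function names are hypothetical.

```python
import math


def headline_length(img):
    """Longest horizontal run of foreground pixels (1s) found in any row."""
    best = 0
    for row in img:
        run = 0
        for px in row:
            run = run + 1 if px else 0
            best = max(best, run)
    return best


def may_be_skewed(img, ratio=0.7):
    """Flag a word whose longest straight run is short relative to its width.

    `ratio` is a hypothetical threshold, not a value taken from the paper.
    """
    width = len(img[0])
    return headline_length(img) < ratio * width


def shear_columns(img, slope):
    """Shift column x vertically by round(slope * x), padding with background.

    A vertical shear is used here as a simple stand-in for a small rotation.
    """
    h, w = len(img), len(img[0])
    shifts = [round(slope * x) for x in range(w)]
    lo, hi = min(shifts), max(shifts)
    out = [[0] * w for _ in range(h + hi - lo)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                out[y + shifts[x] - lo][x] = 1
    return out


def deskew_word(img, max_deg=15, ratio=0.7):
    """Return (image, angle in degrees); words with a long headline are untouched."""
    if not may_be_skewed(img, ratio):
        return img, 0
    best_img, best_deg, best_len = img, 0, headline_length(img)
    # Assumed search strategy: keep the shear that yields the longest headline.
    for deg in range(-max_deg, max_deg + 1):
        candidate = shear_columns(img, math.tan(math.radians(deg)))
        length = headline_length(candidate)
        if length > best_len:
            best_img, best_deg, best_len = candidate, deg, length
    return best_img, best_deg


# Usage: a synthetic 6x20 "word" with a full-width headline and two vertical strokes.
straight = [[1] * 20] + [[1 if x in (4, 12) else 0 for x in range(20)] for _ in range(5)]
skewed = shear_columns(straight, math.tan(math.radians(10)))
corrected, angle = deskew_word(skewed)
```

Only words that fail the headline test enter the angle search, which mirrors the abstract's point that the algorithm corrects skewed words only. A real implementation would operate on scanned word images and would need a threshold tuned to the font and resolution.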