A robust page segmentation method for Persian/Arabic documents

  • Authors:
  • M. Hassan Shirali-Shahreza; Sajad Shirali-Shahreza

  • Affiliations:
  • Computer Engineering Department, Yazd University, Yazd, Iran; Computer Engineering Department, Sharif University of Technology, Tehran, Iran

  • Venue:
  • ISCGAV'05 Proceedings of the 5th WSEAS International Conference on Signal Processing, Computational Geometry & Artificial Vision
  • Year:
  • 2005

Abstract

Optical Character Recognition (OCR) software is widely used in office automation systems. One of the first steps in recognizing a document is segmenting the input image. Various methods have been proposed for English. For Persian/Arabic, however, no complete method has been found yet. In this paper we present a new page segmentation method for printed Persian/Arabic text. The method is inspired by the way ink spreads on paper. One of its most important characteristics is its insensitivity to rotation.
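
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of one common way an ink-spreading idea can be realized, not the authors' method. It assumes a binary page image given as a list of rows (1 = ink, 0 = background), spreads each ink pixel over a disc of a chosen radius, and then groups the merged ink into 8-connected blocks. The function names (`spread_ink`, `label_blocks`, `segment_page`) and the `radius` parameter are hypothetical. A disc is used for the spreading because it treats all directions alike, which is one plausible reason such an approach could tolerate rotated pages; the bounding boxes returned here are axis-aligned purely for simplicity.

```python
from collections import deque


def spread_ink(img, radius):
    """Dilate ink pixels with a disc: every pixel within `radius` of ink becomes ink."""
    h, w = len(img), len(img[0])
    # Offsets inside a disc of the given radius (an isotropic structuring element).
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if dy * dy + dx * dx <= radius * radius]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = 1
    return out


def label_blocks(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from the first unvisited ink pixel.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def segment_page(img, radius=2):
    """Spread the ink, then report one bounding box per merged block."""
    return label_blocks(spread_ink(img, radius))
```

In this sketch, a larger `radius` merges characters into words, lines, and eventually whole text blocks, so the value would need tuning to the scan resolution and font size. With `radius=0` no spreading occurs and each isolated ink region is reported on its own.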