On OCR of Degraded Documents Using Fuzzy Multifactorial Analysis

  • Authors:
  • U. Garain; B. B. Chaudhuri

  • Venue:
  • AFSS '02 Proceedings of the 2002 AFSS International Conference on Fuzzy Systems. Calcutta: Advances in Soft Computing
  • Year:
  • 2002

Abstract

Optical Character Recognition (OCR) systems perform poorly on documents such as old books and newspapers, photocopied material, and faxed pages. Such documents are referred to as degraded documents. An important cause of the low recognition rate on degraded documents is the presence of touching (connected) characters, which makes it difficult to design an effective character segmentation procedure. This paper proposes a new technique for segmenting touching characters, based on fuzzy multifactorial analysis. A predictive algorithm is developed to select the cut-points at which touching characters are split. The method has initially been applied to touching characters in Devnagari (Hindi) and Bangla, two major scripts of the Indian subcontinent. Results on a test set of considerable size show that a high recognition rate can be achieved with a reasonable amount of computation.