Optical Character Recognition (OCR) systems perform poorly on documents such as old books and newspapers, photocopied materials, and faxed documents. Such documents are referred to as degraded documents. One important reason for the poor recognition rate on degraded documents is the presence of touching or connected characters, which makes it difficult to design an effective character segmentation procedure. In this paper, a new technique is proposed for segmenting touching characters. The technique is based on fuzzy multifactorial analysis. A predictive algorithm is developed for effectively selecting cut-points to segment touching characters. Initially, the proposed method has been applied to touching characters in Devnagari (Hindi) and Bangla, two major scripts of the Indian subcontinent. Results obtained from a test set of considerable size show that a high recognition rate can be achieved with a reasonable amount of computation.
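The abstract does not list the factors or the aggregation rule used, so the following is only a minimal sketch of the general idea of multifactor cut-point scoring, not the authors' algorithm. It assumes a binary image (1 = ink) of a touching-character blob and two hypothetical factors per column, each mapped to [0, 1]: `thinness` (little ink in the column) and `centrality` (closeness to the middle of the blob). The factors are combined with a weighted average, which is one simple choice of aggregation; the function names and the weights are illustrative assumptions.

```python
from typing import List, Sequence

Image = List[List[int]]  # rows of 0/1 pixels, 1 = ink


def column_ink(image: Image) -> List[int]:
    """Count ink pixels in every column (vertical projection profile)."""
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def cut_scores(image: Image, weights: Sequence[float] = (0.7, 0.3)) -> List[float]:
    """Score each column as a cut-point candidate; higher is better.

    Both factors and the weights are assumptions made for illustration.
    """
    ink = column_ink(image)
    width = len(ink)
    if width < 3:
        raise ValueError("need at least 3 columns to have an interior cut")
    max_ink = max(ink) or 1          # avoid division by zero on a blank image
    center = (width - 1) / 2
    scores = []
    for x, count in enumerate(ink):
        thinness = 1 - count / max_ink             # 1.0 when the column is empty
        centrality = 1 - abs(x - center) / center  # 1.0 at the middle column
        scores.append(weights[0] * thinness + weights[1] * centrality)
    return scores


def best_cut(image: Image, weights: Sequence[float] = (0.7, 0.3)) -> int:
    """Return the interior column with the highest combined score."""
    scores = cut_scores(image, weights)
    interior = range(1, len(scores) - 1)  # never cut at the blob's border
    return max(interior, key=lambda x: scores[x])


# Two 3-pixel-wide strokes joined by a one-pixel bridge in column 3.
sample = [
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
]
print(best_cut(sample))  # 3, the column containing the thin bridge
```

A real system for Devnagari or Bangla would likely need script-specific factors (for example, handling of the headline stroke) and a check of each candidate cut against the recognizer's output; those parts are not described in the abstract and are left out here.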