International Journal of Digital Library Systems
Preprocessing in handwritten text OCR involves line, word, and character segmentation. This paper addresses text-line identification in handwritten Indian scripts, especially Bangla, as well as Hindi, Malayalam, and others, together with English. Here, a new dual method based on the interdependency between text lines and inter-line gaps is proposed. The method draws curves simultaneously through the text points and the inter-line gap points, which are found from strip-wise histogram peaks and inter-peak valleys, respectively. The curves start from the left and move right, with the points of one type guiding the curve of the other type so that the curves do not intersect. The curves are then allowed to evolve iteratively so that the text-line curves cross more character strokes and the inter-line curves cross fewer, while all curves stay as straight as possible. After several iterations, the curves stabilize and define the final text lines and inter-line gaps. The approach works well on text in different scripts and with various geometric layouts, including poetry.
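As an illustration of the first step only, the following is a minimal Python sketch of how strip-wise histogram peaks and inter-peak valleys might be located. It is not the paper's implementation: the function name `strip_peaks_and_valleys`, the parameters `n_strips` and `min_frac`, and the rule of thresholding the row profile at a fraction of its maximum are all assumptions made for the example. The curve drawing and iterative evolution described above are not covered.

```python
import numpy as np

def strip_peaks_and_valleys(img, n_strips=4, min_frac=0.2):
    """Locate candidate text rows (peaks) and gap rows (valleys) per vertical strip.

    img      : 2D array, nonzero = ink (assumed already binarized).
    n_strips : number of vertical strips (hypothetical parameter).
    min_frac : a row counts as text if its ink count is at least this
               fraction of the strip's maximum (assumed heuristic).
    Returns a list of (x_center, peak_rows, valley_rows), one per strip.
    """
    ink = (np.asarray(img) != 0).astype(int)
    height, width = ink.shape
    bounds = np.linspace(0, width, n_strips + 1).astype(int)
    results = []
    for left, right in zip(bounds[:-1], bounds[1:]):
        x_center = int((left + right) // 2)
        # Horizontal projection profile: ink pixels per row in this strip.
        profile = ink[:, left:right].sum(axis=1)
        if profile.max() == 0:
            results.append((x_center, [], []))
            continue
        is_text = profile >= min_frac * profile.max()
        # Find runs of consecutive text rows as [start, end) intervals.
        padded = np.concatenate(([False], is_text, [False]))
        change = np.flatnonzero(padded[1:] != padded[:-1])
        starts, ends = change[0::2], change[1::2]
        # Peak: row with the most ink inside each text run.
        peaks = [int(s + np.argmax(profile[s:e])) for s, e in zip(starts, ends)]
        # Valley: middle row of the gap between consecutive text runs.
        valleys = [int((e + s_next - 1) // 2)
                   for e, s_next in zip(ends[:-1], starts[1:])]
        results.append((x_center, peaks, valleys))
    return results
```

Under these assumptions, the peak points of neighbouring strips would serve as anchor points for the text-line curves and the valley points as anchor points for the inter-line curves. Working strip by strip rather than on the full page width lets the points follow skewed or curved lines.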