Recognition of Multi-oriented Touching Characters in Graphical Documents

  • Authors:
  • Partha Pratim Roy;Umapada Pal;Josep Lladós

  • Venue:
  • ICVGIP '08 Proceedings of the 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing
  • Year:
  • 2008

Abstract

Touching characters are a major obstacle to achieving high recognition rates in Optical Character Recognition (OCR). Present OCR systems do not perform well when adjacent characters touch. When characters touch in graphical documents (e.g. maps), recognizing the touching string is even more difficult because the characters in such documents appear in multiple orientations. In this paper, we present a scheme for recognizing multi-oriented touching strings of two English characters. When two or more characters touch, they create a large cavity region in the background, and our scheme exploits this background information, using the convex hull to handle it. First, a set of initial segmentation points is predicted from the concave residues of the convex hull of the touching characters. Next, based on these initial points, we select candidate segmentation lines. Finally, for each candidate segmentation line, we compute the recognition confidence of the two sub-images into which it splits the touching string. The candidate line that yields the optimum confidence is taken as the actual segmentation line, and the characters for which the two segments show optimum confidence are taken as the recognition result for the touching string. An SVM classifier is used to compute the recognition confidence. The features used in the SVM are invariant to character orientation: a circular-ring and convex-hull-ring based approach is combined with angular information from the character's contour pixels to make the features rotation invariant. Our experiments gave encouraging results.
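
To illustrate why ring-based features can be insensitive to orientation, the following is a minimal sketch and not the feature extractor described in the paper. It covers only the circular-ring idea, omitting the convex hull rings and the contour angle information, and the function names, the number of rings, and the toy shape are all assumptions made for the example. It counts the fraction of foreground pixels that fall in each concentric ring around the centroid; because the distance to the centroid does not change under rotation, the histogram is the same for a rotated copy of the shape.

```python
import math

def ring_histogram(pixels, n_rings=4):
    """Fraction of foreground pixels in each concentric ring around the centroid.

    `pixels` is a list of (x, y) foreground coordinates. `n_rings` is an
    illustrative choice, not a value taken from the paper.
    """
    if not pixels:
        raise ValueError("need at least one foreground pixel")
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    dists = [math.hypot(x - cx, y - cy) for x, y in pixels]
    r_max = max(dists)
    counts = [0] * n_rings
    for d in dists:
        # Dividing by the outermost radius also normalises for character size.
        idx = min(int(n_rings * d / r_max), n_rings - 1) if r_max > 0 else 0
        counts[idx] += 1
    return [c / len(pixels) for c in counts]

def rotate90(pixels):
    """Rotate pixel coordinates by 90 degrees about the origin."""
    return [(-y, x) for x, y in pixels]

# A small L-shaped blob standing in for a character (hypothetical data).
shape = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (2, 0)]
original = ring_histogram(shape)
rotated = ring_histogram(rotate90(shape))
```

Here `original` and `rotated` agree up to floating-point rounding. The sketch uses a 90-degree rotation because it maps the pixel grid onto itself; for arbitrary angles, resampling on the raster would introduce small differences in the ring counts.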