A robust two level classification algorithm for text localization in documents

  • Authors:
  • R. Kandan;Nirup Kumar Reddy;K. R. Arvind;A. G. Ramakrishnan

  • Affiliations:
  • MILE Laboratory, Electrical Engineering Department, Indian Institute of Science, Bangalore, India (all authors)

  • Venue:
  • ISVC'07 Proceedings of the 3rd international conference on Advances in visual computing - Volume Part II
  • Year:
  • 2007

Abstract

This paper describes a two-level classification algorithm that discriminates handwritten elements from printed text in a printed document. The proposed technique is independent of size, slant, orientation, translation and other variations in the handwritten text. Noise components are removed from the document as a pre-processing step. At the first level, seven invariant central moments are extracted from the document as features and used to classify text as handwritten; we compare a nearest neighbour classifier with a Support Vector Machine (SVM) classifier on this task of localizing the handwritten text. At the second level, we use Delaunay triangulation to reclassify the misclassified elements: the triangulation is imposed on the centroids of the connected components, and features extracted from the resulting triangles drive the reclassification.
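
The first-level feature extraction and nearest neighbour step can be illustrated with a small sketch. The abstract does not name the moment set, so this assumes the "seven invariant central moments" are Hu's seven moment invariants computed from normalised central moments. The function names `hu_moments` and `nearest_neighbour`, and the representation of a connected component as a list of pixel coordinates, are hypothetical choices made for illustration and are not taken from the paper. The SVM classifier and the Delaunay-based second level are not covered here.

```python
import math


def hu_moments(pixels):
    """Hu's seven moment invariants for a list of (x, y) foreground pixels.

    Assumption: one connected component of a binarised document is given
    as its pixel coordinates, each pixel having unit mass.
    """
    n = len(pixels)  # zeroth moment of a binary component
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n

    def eta(p, q):
        # Central moment about the centroid (translation invariant),
        # normalised by n^(1 + (p + q) / 2) for approximate scale invariance.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / n ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03          # recurring sums
    c, d = n30 - 3 * n12, 3 * n21 - n03  # recurring differences
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]


def nearest_neighbour(query, training):
    """Return the label of the training vector closest to `query`.

    `training` is a list of (feature_vector, label) pairs; Euclidean
    distance is an assumption, the abstract does not state the metric.
    """
    return min(training, key=lambda item: math.dist(query, item[0]))[1]
```

In such a setup, each connected component would be mapped to its seven-element feature vector and labelled by `nearest_neighbour` against vectors from labelled handwritten and printed samples. On a pixel grid the moments are exactly invariant to translation and to rotations by multiples of 90 degrees, and only approximately invariant to scaling and arbitrary rotation.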