Semi-Supervised Handwritten Word Segmentation Using Character Samples Similarity Maximization and Evolutionary Algorithm

  • Authors: Jerzy Sas; Urszula Markowska-Kaczmar
  • Affiliations: Wroclaw University of Technology, Poland; Wroclaw University of Technology, Poland
  • Venue: CISIM '07 Proceedings of the 6th International Conference on Computer Information Systems and Industrial Management Applications
  • Year: 2007

Abstract

This paper considers the problem of semi-supervised segmentation of handwriting into isolated character images. Semi-supervised here means that the character sequence constituting the word shown in the image is known, but the character boundaries are not given and must be determined automatically. Semi-supervised word segmentation can be useful in the analytic writer-dependent approach to handwriting recognition, where the training set for a personalized character classifier must be created for each writer from a corpus of that writer's text samples. In the first step, the method described here over-segments the word images into sequences of graphemes. It then seeks a subdivision of these grapheme sequences that yields sets of hypothetical character images with maximal average similarity within the subsets corresponding to the characters of the alphabet. This leads to a combinatorial optimization problem with a very large search space. A suboptimal solution to this problem can be found using an evolutionary algorithm. The character images extracted in this way can be used to train character classifiers. Some preliminary handwriting segmentation results are presented in the paper and compared with fully supervised segmentation carried out by a human.
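
The optimization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the image features, the similarity measure, or the evolutionary operators, so everything below is an assumption made for illustration. Each grapheme is a small numeric vector, a hypothetical character image is the element-wise sum of its graphemes, similarity is negative Euclidean distance, and the search is a plain mutation-only evolutionary loop with truncation selection. All names (`random_cuts`, `split`, `char_feature`, `fitness`, `mutate`, `evolve`, `WORDS`) are hypothetical.

```python
import itertools
import math
import random

# Toy corpus (assumed data): each entry is (known text, grapheme feature vectors).
# Here 'a' is written with two graphemes and 'b' with one.
WORDS = [
    ("ab", [(1, 0), (0, 1), (2, 0)]),
    ("ba", [(2, 0), (1, 0), (0, 1)]),
]

def random_cuts(n_graphemes, n_chars, rng):
    """Pick n_chars-1 distinct inner boundaries, so each character gets >= 1 grapheme.
    Assumes n_graphemes >= n_chars (the word was over-segmented)."""
    return tuple(sorted(rng.sample(range(1, n_graphemes), n_chars - 1)))

def split(graphemes, cuts):
    """Split a grapheme sequence at the cut positions into consecutive groups."""
    bounds = (0, *cuts, len(graphemes))
    return [graphemes[a:b] for a, b in zip(bounds, bounds[1:])]

def char_feature(group):
    """Toy stand-in for a character image: element-wise sum of grapheme vectors."""
    return tuple(sum(column) for column in zip(*group))

def fitness(words, individual):
    """Mean, over characters with >= 2 samples, of the average pairwise similarity
    (negative Euclidean distance) inside that character's sample subset."""
    samples = {}
    for (text, graphemes), cuts in zip(words, individual):
        for ch, group in zip(text, split(graphemes, cuts)):
            samples.setdefault(ch, []).append(char_feature(group))
    scores = []
    for feats in samples.values():
        pairs = list(itertools.combinations(feats, 2))
        if pairs:
            scores.append(sum(-math.dist(a, b) for a, b in pairs) / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0

def mutate(words, individual, rng):
    """Copy the individual and re-draw the cut points of one randomly chosen word."""
    child = list(individual)
    i = rng.randrange(len(words))
    text, graphemes = words[i]
    child[i] = random_cuts(len(graphemes), len(text), rng)
    return child

def evolve(words, pop_size=30, generations=200, seed=0):
    """Mutation-only evolutionary search; keeps the best pop_size individuals
    from parents plus children in every generation (elitist truncation)."""
    rng = random.Random(seed)
    pop = [[random_cuts(len(g), len(t), rng) for t, g in words]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(words, rng.choice(pop), rng) for _ in range(pop_size)]
        pop = sorted(pop + children,
                     key=lambda ind: fitness(words, ind), reverse=True)[:pop_size]
    return pop[0]

if __name__ == "__main__":
    best = evolve(WORDS)
    print(best, fitness(WORDS, best))
```

In this toy corpus only one subdivision makes every sample of a character identical: a cut after the second grapheme of "ab" and after the first grapheme of "ba". The search is expected to converge to it because the space holds just four candidates. A realistic setting would replace the vector sums with image-based features and would likely add crossover; the abstract leaves both choices open.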