Word Segmentation in Handwritten Korean Text Lines Based on Gap Clustering Techniques

  • Authors:
  • Affiliations:
  • Venue: ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
  • Year: 2001

Abstract

We propose a word segmentation method for handwritten Korean text lines. The method uses gap information to separate a text line into word units, where a gap is defined as a white run in the vertical projection of the line image. Each gap is classified as either a between-word gap or a within-word gap using a clustering technique. We consider three gap metrics (BB, RLE and CH), which are known to have superior performance in Roman-style word segmentation, and three clustering techniques: the average linkage method, the modified MAX method and sequential clustering. An experiment with 498 text line images extracted from live mail pieces shows that the best performance is obtained by the sequential clustering technique using all three gap metrics.
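
The gap-based pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binarized line image given as a list of rows (1 = ink, 0 = background), uses the plain white-run width as the only gap measure instead of the BB, RLE or CH metrics, and replaces the paper's clustering techniques with a simple split of the sorted widths at their largest jump. All function names are hypothetical.

```python
def vertical_projection(image):
    """Count ink pixels (value 1) in each column of a binary line image."""
    return [sum(column) for column in zip(*image)]


def find_gaps(projection):
    """Return (start, width) for each interior white run in the projection.

    Leading and trailing margins are ignored, since they do not separate
    two pieces of ink.
    """
    gaps = []
    start = None
    for x, count in enumerate(projection):
        if count == 0:
            if start is None:
                start = x
        else:
            if start is not None and start > 0:
                gaps.append((start, x - start))
            start = None
    return gaps


def classify_gaps(gaps):
    """Label each gap True (between-word) or False (within-word).

    Assumption: a stand-in for the paper's clustering step. The sorted
    widths are split into two groups at the largest jump between
    consecutive values.
    """
    widths = sorted(width for _, width in gaps)
    if len(widths) < 2 or widths[0] == widths[-1]:
        return [False] * len(gaps)  # nothing to separate
    _, idx = max((widths[i + 1] - widths[i], i) for i in range(len(widths) - 1))
    threshold = (widths[idx] + widths[idx + 1]) / 2
    return [width > threshold for _, width in gaps]


def segment_words(image):
    """Return (first_column, last_column) for each word in the line image."""
    projection = vertical_projection(image)
    ink_columns = [x for x, count in enumerate(projection) if count > 0]
    if not ink_columns:
        return []
    gaps = find_gaps(projection)
    labels = classify_gaps(gaps)
    words = []
    word_start = ink_columns[0]
    for (start, width), is_word_gap in zip(gaps, labels):
        if is_word_gap:
            words.append((word_start, start - 1))
            word_start = start + width
    words.append((word_start, ink_columns[-1]))
    return words
```

A width-only split like this fails when a line contains a single word or when within-word and between-word gaps overlap in size, which is the situation the metrics and clustering techniques compared in the paper are meant to handle.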