Letters: A new handwritten character segmentation method based on nonlinear clustering

  • Authors:
  • Jun Tan;Jian-Huang Lai;Chang-Dong Wang;Wen-Xian Wang;Xiao-Xiong Zuo

  • Affiliations:
  • School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou, PR China;School of Information Science and Technology, Sun Yat-sen University, Guangzhou, PR China;School of Information Science and Technology, Sun Yat-sen University, Guangzhou, PR China;Public Security of Yuexiu, Guangzhou, PR China;Public Security of Yuexiu, Guangzhou, PR China

  • Venue:
  • Neurocomputing
  • Year:
  • 2012

Abstract

In handwritten character recognition, segmenting a text line into characters is a significant step. Unsupervised clustering is a common approach to this task. However, because characters often overlap and touch, the separation boundaries between two characters are usually nonlinear, which causes widely used clustering methods such as k-means to fail. To tackle this problem, this paper proposes a new handwritten character segmentation method based on nonlinear clustering. In the proposed approach, we first segment the entire text line into strokes and compute their similarity matrix from the stroke gravities. A nonlinear clustering method is then applied to this similarity matrix to obtain a cluster label for each stroke, and the strokes are combined into characters according to these labels. We consider two nonlinear clustering methods: spectral clustering based on Normalized cut (Ncut) and kernel clustering based on Conscience On-Line Learning (COLL). This yields two segmentation approaches: SegNcut, which uses Ncut, and SegCOLL, which uses COLL. Experiments on four databases demonstrate the effectiveness of SegNcut and SegCOLL.
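The pipeline described in the abstract (strokes, similarity matrix, nonlinear clustering, grouping) can be illustrated with a minimal sketch of the Ncut variant. It is not the authors' implementation, and several details are assumptions the abstract does not confirm: "stroke gravity" is read as the center of gravity of a stroke's pixels, the similarity is a Gaussian of the distance between gravities with a hand-picked `sigma`, only a two-way cut is shown, and the second eigenvector is found by plain power iteration rather than a library eigensolver. All function names are hypothetical. COLL is not shown.

```python
import math

def stroke_gravity(points):
    """Center of gravity (mean x, mean y) of one stroke's pixel coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def similarity_matrix(centers, sigma):
    """Gaussian similarity between stroke gravities (assumed form, not from the paper)."""
    n = len(centers)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = centers[i][0] - centers[j][0]
            dy = centers[i][1] - centers[j][1]
            w[i][j] = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))
    return w

def ncut_bipartition(w, iters=500):
    """Two-way normalized cut from the second eigenvector of M = D^-1/2 W D^-1/2."""
    n = len(w)
    root = [math.sqrt(sum(row)) for row in w]  # square roots of the degrees
    # The leading eigenvector of M is sqrt(degree), with eigenvalue 1.
    norm = math.sqrt(sum(r * r for r in root))
    top = [r / norm for r in root]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        # Deflation: remove the component along the leading eigenvector.
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        # Multiply by (M + I) / 2 so that every eigenvalue is non-negative.
        mv = [sum(w[i][j] / (root[i] * root[j]) * v[j] for j in range(n))
              for i in range(n)]
        v = [(a + b) / 2.0 for a, b in zip(mv, v)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # Map back to the generalized eigenvector y = D^-1/2 z and split at zero.
    return [1 if v[i] / root[i] > 0 else 0 for i in range(n)]

def group_strokes(labels):
    """Collect stroke indices per cluster label; each group is one character."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return sorted(groups.values())

# Toy text line: three strokes near x = 10 and two strokes near x = 42.
strokes = [
    [(8, 5), (8, 15)],
    [(10, 4), (14, 4)],
    [(9, 12), (13, 16)],
    [(40, 5), (40, 15)],
    [(42, 8), (46, 12)],
]
centers = [stroke_gravity(s) for s in strokes]
labels = ncut_bipartition(similarity_matrix(centers, sigma=10.0))
characters = group_strokes(labels)
print(characters)
```

On this toy input the two stroke groups are far apart relative to `sigma`, so the cut separates them cleanly. A real text line would need more than two clusters, for example by using several eigenvectors or by recursive bipartition, and a principled choice of `sigma`.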