Letters: A new handwritten character segmentation method based on nonlinear clustering

Authors:
Jun Tan;Jian-Huang Lai;Chang-Dong Wang;Wen-Xian Wang;Xiao-Xiong Zuo
Affiliations:
School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou, PR China;School of Information Science and Technology, Sun Yat-sen University, Guangzhou, PR China;School of Information Science and Technology, Sun Yat-sen University, Guangzhou, PR China;Public Security of Yuexiu, Guangzhou, PR China;Public Security of Yuexiu, Guangzhou, PR China
Venue:
Neurocomputing
Year:
2012

Abstract

Segmenting a text line into characters is a key step in handwritten character recognition. Unsupervised clustering is a common approach to this task. However, because characters often overlap and touch, the separation boundary between two characters is usually nonlinear, and widely used clustering methods such as k-means fail. To tackle this problem, this paper proposes a new handwritten character segmentation method based on nonlinear clustering. In the proposed approach, we first segment the entire text line into strokes and compute a similarity matrix over these strokes from their centers of gravity. A nonlinear clustering method is then applied to this similarity matrix to obtain a cluster label for each stroke, and strokes that share a label are combined to form a character. We consider two nonlinear clustering methods: spectral clustering based on Normalized cut (Ncut) and kernel clustering based on Conscience On-Line Learning (COLL). This yields two segmentation approaches: SegNcut, which uses Ncut, and SegCOLL, which uses COLL. Experiments on four databases demonstrate the effectiveness of SegNcut and SegCOLL.
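
The pipeline described in the abstract (strokes, similarity matrix, nonlinear clustering, character labels) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how similarity is derived from the stroke gravities, so the Gaussian kernel on centroid distance, the `sigma` value, and the toy centroid coordinates below are all assumptions. The sketch also covers only the two-character case, using the standard two-way Normalized cut relaxation (thresholding the second generalized eigenvector at zero); the function names are hypothetical.

```python
import numpy as np


def gaussian_similarity(centroids, sigma):
    """Similarity matrix from stroke centers of gravity.

    Assumption: a Gaussian kernel on Euclidean distance. The source only
    states that similarity is computed from stroke gravities.
    """
    diff = centroids[:, None, :] - centroids[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def ncut_bipartition(W):
    """Two-way Normalized cut relaxation.

    Solves (D - W) y = lambda * D y through the symmetric normalized
    Laplacian and splits the strokes by the sign of the eigenvector
    belonging to the second-smallest eigenvalue.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]       # map back to the generalized problem
    return (y > 0).astype(int)


# Hypothetical stroke centroids (x, y) for a line holding two characters.
centroids = np.array(
    [[10, 20], [14, 35], [18, 25],    # strokes of the left character
     [60, 22], [66, 30], [63, 38]],   # strokes of the right character
    dtype=float,
)

W = gaussian_similarity(centroids, sigma=15.0)
labels = ncut_bipartition(W)

# Strokes sharing a label are combined into one character.
characters = [np.flatnonzero(labels == k).tolist() for k in (0, 1)]
print(labels, characters)
```

Which group receives label 0 depends on the sign of the eigenvector returned by the solver, so only the grouping is meaningful. A line with more than two characters would need a multi-way extension, for example clustering the rows of the leading eigenvectors, which this sketch does not attempt.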