A robust approach to text line grouping in online handwritten Japanese documents

  • Authors:
  • Xiang-Dong Zhou;Da-Han Wang;Cheng-Lin Liu

  • Affiliations:
  • National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95 Zhongguancun East Road, Beijing 100190, PR China (all three authors)

  • Venue:
  • Pattern Recognition
  • Year:
  • 2009

Abstract

We present an effective approach for grouping text lines in online handwritten Japanese documents that combines temporal and spatial information. Because its decision functions are optimized by supervised learning, the approach has few manually set parameters and relies on little prior knowledge. Stroke type classification (text/non-text) precedes text line grouping. First, the strokes in the document are grouped into text line strings according to off-stroke distances. Each text line string, which may contain multiple lines, is then segmented by optimizing a cost function trained with the minimum classification error (MCE) method. At the temporal merge stage, a support vector machine (SVM) classifier makes merge/non-merge decisions to merge over-segmented text lines caused by stroke classification errors; misclassified text/non-text strokes can also be corrected at this stage. Last, a spatial merge module corrects the segmentation errors caused by delayed strokes. To evaluate text line grouping from multiple aspects, we provide a set of performance metrics. In experiments on a large number of free-form documents in the Tokyo University of Agriculture and Technology (TUAT) Kondate database, the proposed approach achieves an entity detection metric (EDM) rate of 0.8992 and an edit-distance rate (EDR) of 0.1114. For grouping of pure text strokes, the performance reaches an EDM of 0.9591 and an EDR of 0.0669.
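
The first stage described in the abstract, grouping strokes into text line strings by off-stroke distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the off-stroke distance is thresholded, so the fixed cutoff `max_gap`, the point-list stroke representation, and the function names `off_stroke_distance` and `group_by_off_stroke` are all assumptions made for illustration. The off-stroke distance is taken here to be the Euclidean distance from the pen-up point of one stroke to the pen-down point of the next stroke in temporal order.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # sampled pen positions of one stroke, in temporal order


def off_stroke_distance(prev: Stroke, nxt: Stroke) -> float:
    """Euclidean distance from the pen-up point of `prev` to the pen-down point of `nxt`."""
    (x0, y0), (x1, y1) = prev[-1], nxt[0]
    return hypot(x1 - x0, y1 - y0)


def group_by_off_stroke(strokes: List[Stroke], max_gap: float) -> List[List[Stroke]]:
    """Split a temporally ordered stroke sequence wherever the off-stroke
    distance exceeds `max_gap` (a hypothetical fixed threshold)."""
    groups: List[List[Stroke]] = []
    for stroke in strokes:
        # Continue the current string only if the pen jump from its last stroke is small.
        if groups and off_stroke_distance(groups[-1][-1], stroke) <= max_gap:
            groups[-1].append(stroke)
        else:
            groups.append([stroke])
    return groups


# Two strokes written close together, then a long pen jump to a third stroke.
strokes = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(2.0, 0.0), (3.0, 0.0)],
    [(50.0, 40.0), (51.0, 40.0)],
]
strings = group_by_off_stroke(strokes, max_gap=5.0)
```

With this toy input, the first two strokes end up in one string and the third starts a new one. Each resulting string may still contain several text lines, which is why the abstract follows this stage with a learned segmentation step and two merge stages.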