Application of bi-gram driven Chinese handwritten character segmentation for an address reading system

  • Authors:
  • Yan Jiang;Xiaoqing Ding;Qiang Fu;Zheng Ren

  • Affiliations:
  • Department of Electronic Engineering, Tsinghua University, Beijing, China;Department of Electronic Engineering, Tsinghua University, Beijing, China;Department of Electronic Engineering, Tsinghua University, Beijing, China;Siemens AG, Konstanz, Germany

  • Venue:
  • DAS'06 Proceedings of the 7th international conference on Document Analysis Systems
  • Year:
  • 2006

Abstract

In this paper, we describe a bi-gram driven method for the automatic reading of handwritten Chinese mail. To locate the destination address block (DAB), text lines are first extracted by connected component analysis. Each candidate line is then segmented and recognized by our holistic method, which incorporates mail layout features, recognition confidence and context cost. The same three cues are used to identify the DABs among the candidate text lines. From the identified DABs, the street address line and the organization name line are determined. In the last step, edit-distance-based string matching is performed against the given databases. We also discuss the preprocessing applied to Chinese address databases, which contain a large vocabulary, in order to generate keywords for fast indexing during matching. Detailed experimental results on handwritten mail samples are given in the last section.
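
The abstract does not say how the edit-distance matching step is implemented. As an illustration only, the following minimal Python sketch computes the standard Levenshtein distance and picks the closest database entry by exhaustive search. The function names `edit_distance` and `best_match` and the sample address strings are hypothetical, and the sketch omits the keyword indexing that the paper uses to speed up matching.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row DP table."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_match(recognized: str, database: list[str]) -> tuple[str, int]:
    """Return the database entry closest to the recognized string.

    Assumption: a plain linear scan over the database, with ties
    resolved in favor of the earliest entry.
    """
    scored = ((entry, edit_distance(recognized, entry)) for entry in database)
    return min(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical recognizer output with one misread character.
    print(best_match("清华大字", ["北京大学", "清华大学"]))
```

A linear scan costs one distance computation per database entry, which is presumably why a large address vocabulary benefits from keyword indexing that narrows the candidate set before matching.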