A fast algorithm of address lines extraction on complex Chinese mail pieces

  • Authors:
  • Tong Liu;Xiaoqing Ding;Qiang Fu;Zheng Ren

  • Affiliations:
  • State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing, P.R. China;State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing, P.R. China;State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing, P.R. China;Siemens AG, Buecklestrasse, Konstanz, Germany

  • Venue:
  • SPPRA'06 Proceedings of the 24th IASTED international conference on Signal processing, pattern recognition, and applications
  • Year:
  • 2006

Abstract

A fast and efficient method is presented for extracting address lines from both machine-printed and handwritten Chinese mail envelopes. The algorithm follows a bottom-up approach. First, we select text blocks from the connected components (CCs) and immediately group them into initial lines. Then, average text block features are computed to validate the initial text lines and to guide an iterative split-and-merge process. Lines are split by merging their text CCs more finely, according to criteria for the similarity and consistency of neighboring text blocks. In particular, some non-text blocks within the lines are recovered if they are similar to other text blocks. Skew detection and a corresponding deskew step follow. We tested the method on a large mail sample test deck covering different categories of envelopes; compared to our previous system, it shows a clear improvement in both accuracy and computation time.
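
The first bottom-up stage described above (selecting text blocks from CCs and grouping them into initial lines) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Box` type, the size filter in `is_text_block`, the overlap and gap thresholds, and the restriction to horizontal lines are all assumptions made for illustration, and the abstract does not specify the actual criteria.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Bounding box of one connected component (hypothetical representation)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height

    @property
    def right(self) -> int:
        return self.x + self.w

    @property
    def bottom(self) -> int:
        return self.y + self.h


def is_text_block(b: Box, min_side: int = 8, max_side: int = 80,
                  max_aspect: float = 3.0) -> bool:
    """Assumed filter: keep boxes whose size and shape look character-like."""
    if min(b.w, b.h) < min_side or max(b.w, b.h) > max_side:
        return False
    return max(b.w, b.h) / min(b.w, b.h) <= max_aspect


def vertical_overlap(a: Box, b: Box) -> float:
    """Overlap of the two vertical extents, as a fraction of the smaller height."""
    overlap = min(a.bottom, b.bottom) - max(a.y, b.y)
    return max(0, overlap) / min(a.h, b.h)


def group_into_lines(boxes: List[Box], min_overlap: float = 0.5,
                     max_gap_factor: float = 2.0) -> List[List[Box]]:
    """Greedily attach each text block to the first line whose last block
    overlaps it vertically and lies within an assumed horizontal gap limit."""
    text_blocks = sorted((b for b in boxes if is_text_block(b)), key=lambda b: b.x)
    lines: List[List[Box]] = []
    for b in text_blocks:
        for line in lines:
            last = line[-1]
            gap = b.x - last.right
            if (vertical_overlap(last, b) >= min_overlap
                    and gap <= max_gap_factor * max(last.h, b.h)):
                line.append(b)
                break
        else:
            lines.append([b])  # no compatible line found: start a new one
    return lines


def mean_height(line: List[Box]) -> float:
    """One example of an average block feature that could guide later validation."""
    return sum(b.h for b in line) / len(line)
```

An average feature such as `mean_height` is the kind of per-line statistic that a later split-and-merge pass could compare against each block. The abstract does not say which features are used.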