An EMD-based recognition method for Chinese fonts and styles

  • Authors:
  • Zhihua Yang; Lihua Yang; Dongxu Qi; Ching Y. Suen

  • Affiliations:
  • Information Science and Technology School, Guangdong University of Business Studies, Guangzhou, PR China; Department of Scientific Computing and Computer Applications, School of Mathematics and Computing Science, Sun Yat-sen University, Guangzhou 510275, PR China; School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510275, PR China and Faculty of Information Technology, Macao University of Science and Technology, Macao; Center for Pattern Recognition and Machine Intelligence, Concordia University, Montreal, Canada H3G 1M8

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2006

Abstract

This paper presents a novel method for recognizing Chinese fonts based on empirical mode decomposition (EMD). After analyzing and comparing a large number of Chinese characters, five basic strokes were selected to characterize the stroke features of Chinese fonts. From these strokes, stroke feature sequences are computed for a given text block. Decomposing each sequence with EMD yields several intrinsic mode functions (IMFs); the first two, which have the highest frequencies, are used to compute the stroke high frequency energy, defined as the average energy of these two IMFs over the length of the sequence. The stroke high frequency energies of the five basic strokes are combined with the averages of the five residues, called stroke low frequency energies, to form a 10-dimensional feature vector. Finally, a minimum distance classifier is used to recognize the fonts, and encouraging experimental results have been obtained. The main advantages of the algorithm are that (1) the feature dimension is very low; (2) fewer samples are needed to train the classifier; and (3) most importantly, it is the first attempt to apply the new theory of the Hilbert-Huang transform to document analysis and recognition.
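The feature construction and classification steps described in the abstract can be illustrated with a minimal sketch. It assumes the EMD step has already been run, so each basic stroke comes with its first two IMFs and its residue as plain lists. The abstract does not give exact formulas, so the energy definition (mean of the summed squared IMF values), the per-class mean used as the reference vector, and the Euclidean distance are assumptions, and every function name and font label here is hypothetical rather than taken from the paper.

```python
import math

def high_freq_energy(imf1, imf2):
    # Assumed definition: summed squared values of the first two IMFs,
    # averaged over the length of the sequence.
    n = len(imf1)
    return sum(a * a + b * b for a, b in zip(imf1, imf2)) / n

def low_freq_energy(residue):
    # Assumed definition: plain average of the EMD residue.
    return sum(residue) / len(residue)

def feature_vector(decompositions):
    # decompositions: five (imf1, imf2, residue) tuples, one per basic stroke.
    # Returns five high frequency energies followed by five low frequency ones.
    high = [high_freq_energy(c1, c2) for c1, c2, _ in decompositions]
    low = [low_freq_energy(r) for _, _, r in decompositions]
    return high + low

def train_centroids(samples):
    # samples: dict mapping a font label to a list of feature vectors.
    # The reference vector of each font is assumed to be the class mean.
    centroids = {}
    for font, vectors in samples.items():
        dim = len(vectors[0])
        centroids[font] = [sum(v[i] for v in vectors) / len(vectors)
                           for i in range(dim)]
    return centroids

def classify(vector, centroids):
    # Minimum distance classifier: pick the font with the nearest reference
    # vector (Euclidean distance is an assumption).
    return min(centroids, key=lambda font: math.dist(vector, centroids[font]))

# Toy data only; not taken from the paper's experiments.
stroke = ([1, -1, 1, -1], [0.5, -0.5, 0.5, -0.5], [2, 2, 2, 2])
features = feature_vector([stroke] * 5)
centroids = train_centroids({
    "font_a": [[0.0] * 10, [2.0] * 10],
    "font_b": [[10.0] * 10],
})
predicted = classify(features, centroids)
```

Because each font is summarized by a single 10-dimensional reference vector, a classifier of this kind needs only a few training samples per font, which is consistent with advantages (1) and (2) listed in the abstract.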