Multilingual OCR research and applications: an overview
Proceedings of the 4th International Workshop on Multilingual OCR
Feature extraction is an important step in off-line handwriting recognition systems: it represents raw handwriting in a low-dimensional, tractable feature space. Traditionally, linear feature transforms such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are used. The assumptions these transforms make, however, are rarely satisfied in practice, so they do not achieve the best performance. In this paper, we apply the Region-Dependent non-linear feature Transform (RDT) to handwriting recognition. RDT is a non-linear feature transform that captures discriminative information much better than traditional linear transforms. We demonstrate the effectiveness of RDT on handwriting features using an HMM-based handwriting recognition system on an Arabic handwriting dataset consisting of 38K pages of handwriting with over 3M handwritten words. Experimental results show that RDT reduces word error rates (WERs) by 4% to 7% relative, with statistical significance, compared with two LDA-based baseline systems.
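The abstract does not spell out the form of the transform, so the following is only a minimal sketch of one plausible reading of "region-dependent": the feature space is softly partitioned into regions by Gaussian posteriors, each region has its own affine transform, and the output is the posterior-weighted sum of the per-region projections. The diagonal-covariance Gaussians, the function names, and the toy REGIONS values are all assumptions made for illustration; none of them come from the paper, and the actual system presumably trains the transforms discriminatively rather than setting them by hand.

```python
import math


def gaussian_log_density(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def region_posteriors(x, regions):
    """Soft assignment of x to each region (normalized weighted densities)."""
    logs = [
        math.log(r["weight"]) + gaussian_log_density(x, r["mean"], r["var"])
        for r in regions
    ]
    peak = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def affine(matrix, bias, x):
    """Compute matrix @ x + bias with plain lists."""
    return [
        sum(a * xi for a, xi in zip(row, x)) + b
        for row, b in zip(matrix, bias)
    ]


def region_dependent_transform(x, regions):
    """Posterior-weighted sum of per-region affine projections of x."""
    posteriors = region_posteriors(x, regions)
    out_dim = len(regions[0]["bias"])
    y = [0.0] * out_dim
    for p, r in zip(posteriors, regions):
        projected = affine(r["matrix"], r["bias"], x)
        for k in range(out_dim):
            y[k] += p * projected[k]
    return y


# Hypothetical toy setup (not from the paper): two regions mapping 2-D to 1-D.
# The left region keeps the first coordinate, the right region the second.
REGIONS = [
    {"weight": 0.5, "mean": [-2.0, 0.0], "var": [1.0, 1.0],
     "matrix": [[1.0, 0.0]], "bias": [0.0]},
    {"weight": 0.5, "mean": [2.0, 0.0], "var": [1.0, 1.0],
     "matrix": [[0.0, 1.0]], "bias": [0.0]},
]

if __name__ == "__main__":
    print(region_dependent_transform([-2.0, 3.0], REGIONS))
    print(region_dependent_transform([0.0, 4.0], REGIONS))
```

With a single region the sketch reduces to an ordinary linear (affine) projection of the kind PCA or LDA produces, which is one way to see how a region-dependent transform generalizes the linear baselines: the projection applied to a feature vector changes smoothly with where that vector lies in feature space.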