Evaluation of weighted Fisher criteria for large category dimensionality reduction in application to Chinese handwriting recognition

  • Authors:
  • Xu-Yao Zhang;Cheng-Lin Liu

  • Affiliations:
  • National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, No. 95 Zhongguancun East Road, Beijing 100190, P.R. China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, No. 95 Zhongguancun East Road, Beijing 100190, P.R. China

  • Venue:
  • Pattern Recognition
  • Year:
  • 2013

Abstract

To improve the class separability of Fisher linear discriminant analysis (FDA) for large category problems, we investigate the weighted Fisher criterion (WFC), which integrates weighting functions into dimensionality reduction. The objective of WFC is to maximize the sum of weighted distances over all class pairs. By setting larger weights for the most confusable classes, WFC can improve class separation while the solution remains an eigen-decomposition problem. We evaluate five weighting functions in three different weighting spaces on a typical large category problem, handwritten Chinese character recognition. Four of the weighting functions are based on existing methods, namely FDA, the approximate pairwise accuracy criterion (aPAC), the power function (POW), and confused distance maximization (CDM); the fifth is a new one based on K-nearest neighbors (KNN). All the weighting functions can be calculated in the original feature space, a low-dimensional space, or a fractional space. Our experiments on a 3,755-class Chinese handwriting database demonstrate that WFC significantly improves classification accuracy compared to FDA. Among the weighting functions, the KNN method in the original space is the most competitive: it achieves significantly higher classification accuracy and has low computational complexity. To further improve performance, we propose a nonparametric extension of the KNN method from the class level to the sample level. The sample-level KNN (SKNN) method is shown to significantly outperform other methods in Chinese handwriting recognition, such as locally linear discriminant analysis (LLDA), neighbor class linear discriminant analysis (NCLDA), and heteroscedastic linear discriminant analysis (HLDA).
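
The idea of a weighted Fisher criterion that still reduces to an eigen-decomposition can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function name `weighted_fisher_projection`, the `weight_fn` and `reg` parameters, the use of Euclidean distance between class means in the original space, and the inverse-power weight in the usage note are all assumptions made for illustration; the abstract does not give the exact form of any weighting function. With a constant weight the sketch reduces to ordinary FDA.

```python
import numpy as np

def weighted_fisher_projection(X, y, n_components, weight_fn=None, reg=1e-6):
    """Return a (d, n_components) projection maximizing a weighted Fisher criterion.

    weight_fn maps the distance between two class means to a nonnegative
    weight (hypothetical interface). weight_fn=None gives weight 1, i.e. FDA.
    """
    classes = np.unique(y)
    n, d = X.shape
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.stack([X[y == c].mean(axis=0) for c in classes])

    # Within-class scatter: prior-weighted sum of class covariances.
    Sw = np.zeros((d, d))
    for c, m in zip(classes, means):
        Xc = X[y == c] - m
        Sw += Xc.T @ Xc / n
    Sw += reg * np.eye(d)  # small ridge so Sw is positive definite (assumption)

    if weight_fn is None:
        weight_fn = lambda dist: 1.0

    # Weighted between-class scatter: sum over all class pairs.
    Sb = np.zeros((d, d))
    for i in range(len(classes) - 1):
        for j in range(i + 1, len(classes)):
            diff = means[i] - means[j]
            dist = np.linalg.norm(diff)  # distance in the original space
            Sb += priors[i] * priors[j] * weight_fn(dist) * np.outer(diff, diff)

    # Solve Sb v = lambda Sw v by whitening with the Cholesky factor of Sw.
    L = np.linalg.cholesky(Sw)
    L_inv = np.linalg.inv(L)
    evals, U = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    order = np.argsort(evals)[::-1][:n_components]
    return L_inv.T @ U[:, order]
```

As a usage example, `weighted_fisher_projection(X, y, 2, weight_fn=lambda dist: dist ** -2.0)` gives close (more confusable) class pairs a larger weight than distant ones, and `X @ W` then yields the reduced features. The double loop over class pairs is quadratic in the number of classes, which is acceptable for a sketch but would need care at the 3,755-class scale discussed above.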