A Comparison of Clustering Methods for Writer Identification and Verification

  • Authors:
  • Marius Bulacu; Lambert Schomaker

  • Affiliations:
  • AI Institute, Groningen University, The Netherlands; AI Institute, Groningen University, The Netherlands

  • Venue:
  • ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
  • Year:
  • 2005


Abstract

An effective method for writer identification and verification is based on assuming that each writer acts as a stochastic generator of ink-trace fragments, or graphemes. The probability distribution of these simple shapes in a given handwriting sample is characteristic of the writer and is computed using a common codebook of graphemes obtained by clustering. In previous studies we used contours to encode the graphemes; in the current paper we explore a complementary shape representation using normalized bitmaps. The most important aim of the current work is to compare three different clustering methods for generating the grapheme codebook: k-means, Kohonen SOM 1D and Kohonen SOM 2D. Large-scale computational experiments show that the proposed method is robust to the underlying shape representation (whether contours or normalized bitmaps), to the size of the codebook (stable performance for sizes from 10^2 to 2.5×10^3) and to the clustering method used to generate the codebook (essentially the same performance was obtained for all three clustering methods).
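
The codebook idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each grapheme has already been turned into a fixed-length feature vector (for example, a flattened normalized bitmap), it uses a plain k-means with random initialization in place of the three clustering variants compared in the paper, and the function names (`kmeans_codebook`, `grapheme_pdf`, `chi2_distance`) are hypothetical. The chi-square distance for comparing two writers is also an assumption; the abstract does not name a distance measure.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codebook entry closest to vector v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def kmeans_codebook(samples, k, iters=20, seed=0):
    """Build a k-entry codebook from pooled grapheme vectors (plain k-means)."""
    rng = random.Random(seed)
    codebook = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:
            groups[nearest(codebook, s)].append(s)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                codebook[i] = [sum(col) / len(g) for col in zip(*g)]
    return codebook


def grapheme_pdf(codebook, graphemes):
    """Normalized histogram of nearest-codebook-entry counts for one sample."""
    counts = [0] * len(codebook)
    for g in graphemes:
        counts[nearest(codebook, g)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def chi2_distance(p, q):
    """Chi-square distance between two distributions (an assumed choice)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)
```

Under these assumptions, identification would amount to computing `grapheme_pdf` for a query sample and returning the enrolled writer whose distribution has the smallest distance to it. Verification would compare that distance against a threshold.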