The CADAL calligraphic database

  • Authors:
  • Xiafen Zhang; George Nagy

  • Affiliations:
  • Shanghai Maritime University, Shanghai, P. R. China; Rensselaer Polytechnic Institute, Troy, NY

  • Venue:
  • Proceedings of the 2011 Workshop on Historical Document Imaging and Processing
  • Year:
  • 2011

Abstract

A set of 13,351 digitized calligraphic characters was segmented and labeled: 11,908 characters were extracted from 21 books scanned by the CADAL scanning center in Zhejiang University's library, and 1,443 characters came from calligraphy works found in web sources. The database contains calligraphy from 208 works, some more than 1,000 years old. Statistics are provided on provenance, character size and shape, and label frequency distribution. Specific problems encountered in creating a calligraphic database are illustrated and discussed. Progress is reported on a classifier-based interactive labeling system that halves the human labor needed to expand the database.