Bangla date field extraction in offline handwritten documents

  • Authors:
  • Ranju Mandal;Partha Pratim Roy;Umapada Pal

  • Affiliations:
  • Indian Statistical Institute, Kolkata, India;Université François Rabelais, Tours, France;Indian Statistical Institute, Kolkata, India

  • Venue:
  • Proceeding of the workshop on Document Analysis and Recognition
  • Year:
  • 2012

Abstract

Date information is useful in various applications (e.g. date-wise document indexing), but its automatic extraction is challenging because of variation in individual writing styles, touching characters, and confusion among numerals, punctuation and text. In this paper, we present a framework for indexing/retrieval of Bangla date patterns from handwritten documents. The method first classifies the word components of each text line into month and non-month classes using word-level features. Next, non-month words are segmented into individual components, each of which is classified as text, digit or punctuation. Using this word- and character-level information, the date patterns are searched: candidate lines for numeric and semi-numeric dates are detected first with a voting approach and then with regular expressions. Dynamic Time Warping (DTW) matching of profile-based features is used for month/non-month word classification. Numerals and punctuation are classified using gradient-based features and an SVM classifier. Experiments are performed on a Bangla handwritten dataset, and the results demonstrate the effectiveness of the proposed system.
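
As a rough illustration of two steps named in the abstract, the sketch below shows a textbook DTW distance between two 1-D profile sequences and a regular-expression search over a sequence of component labels. It is not the authors' implementation: the label alphabet (M, D, P, T), the two patterns, the function names and the threshold parameter are all assumptions made for this example, and the abstract does not specify the actual features, cost function or expressions used.

```python
import re


def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences, absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def is_month(profile, month_templates, threshold):
    """Hypothetical rule: a word is a month if its nearest template is close enough."""
    return min(dtw_distance(profile, t) for t in month_templates) <= threshold


# Hypothetical label alphabet (not from the paper):
# M = month word, D = digit, P = punctuation, T = other text.
NUMERIC_DATE = re.compile(r"D{1,2}PD{1,2}PD{2,4}")       # e.g. 12/05/2012
SEMI_NUMERIC_DATE = re.compile(r"D{1,2}P?MP?D{2,4}")     # e.g. 12 <month> 2012


def find_date_patterns(labels):
    """Return (kind, start, end) for each date-like run in a line's label sequence."""
    s = "".join(labels)
    hits = []
    for kind, pattern in (("numeric", NUMERIC_DATE),
                          ("semi-numeric", SEMI_NUMERIC_DATE)):
        for match in pattern.finditer(s):
            hits.append((kind, match.start(), match.end()))
    return hits
```

In this sketch, each text line is reduced to one label per component, so a date search becomes plain string matching; a real system would also have to handle misclassified components, which is presumably where the voting step described above comes in.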