Handwritten document image segmentation into text lines and words

  • Authors:
  • Vassilis Papavassiliou;Themos Stafylakis;Vassilis Katsouros;George Carayannis

  • Affiliations:
  • Institute for Language and Speech Processing of R.C. "Athena" Artemidos 6 & Epidavrou, GR-151 25 Maroussi, Greece and National Technical University of Athens, School of Electrical and Computer Eng ...;Institute for Language and Speech Processing of R.C. "Athena" Artemidos 6 & Epidavrou, GR-151 25 Maroussi, Greece and National Technical University of Athens, School of Electrical and Computer Eng ...;Institute for Language and Speech Processing of R.C. "Athena" Artemidos 6 & Epidavrou, GR-151 25 Maroussi, Greece;Institute for Language and Speech Processing of R.C. "Athena" Artemidos 6 & Epidavrou, GR-151 25 Maroussi, Greece and National Technical University of Athens, School of Electrical and Computer Eng ...

  • Venue:
  • Pattern Recognition
  • Year:
  • 2010

Abstract

Two novel approaches to extracting text lines and words from handwritten documents are presented. The line segmentation algorithm locates the optimal succession of text and gap areas within vertical zones by applying the Viterbi algorithm. A text-line separator drawing technique is then applied, and finally the connected components are assigned to text lines. Word segmentation is based on a gap metric that exploits the objective function of a soft-margin linear SVM that separates successive connected components. The algorithms were tested on the benchmarking datasets of the ICDAR07 handwriting segmentation contest and outperformed the participating algorithms.
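
The idea of finding an optimal succession of text and gap areas with the Viterbi algorithm can be illustrated with a small two-state sketch. This is not the authors' implementation: the abstract does not specify the observation model, so the ink-density profile, the Gaussian-shaped emission scores, and the parameters `mu`, `sigma`, and `p_stay` below are all assumptions chosen for illustration, and `viterbi_text_gap` is a hypothetical name.

```python
import math

TEXT, GAP = 0, 1  # the two hidden states for each row of a vertical zone


def viterbi_text_gap(profile, mu=(0.30, 0.0), sigma=0.1, p_stay=0.9):
    """Label each row of one vertical zone as TEXT or GAP.

    profile : ink density per pixel row, each value in [0, 1] (assumed input).
    mu      : assumed mean density for (TEXT, GAP) rows.
    sigma   : assumed spread shared by both states.
    p_stay  : assumed probability of remaining in the same state.
    """
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def log_emit(state, density):
        # Unnormalised Gaussian log-likelihood; the normaliser is the same
        # for both states, so it does not change the best path.
        return -((density - mu[state]) ** 2) / (2.0 * sigma ** 2)

    states = (TEXT, GAP)
    # score[s] holds the best log-score of any path ending in state s.
    score = [math.log(0.5) + log_emit(s, profile[0]) for s in states]
    back = []  # back[t][s] = best predecessor of state s at row t + 1

    for density in profile[1:]:
        new_score, pointers = [], []
        for s in states:
            cands = [
                score[prev] + (log_stay if prev == s else log_switch)
                for prev in states
            ]
            best_prev = max(states, key=lambda prev: cands[prev])
            new_score.append(cands[best_prev] + log_emit(s, density))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final state to recover the full labelling.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    zone = [0.0, 0.0, 0.3, 0.3, 0.05, 0.3, 0.3, 0.0, 0.0, 0.0]
    labels = viterbi_text_gap(zone)
    print(["TEXT" if s == TEXT else "GAP" for s in labels])
```

With these assumed parameters, the transition penalty keeps the single low-density row inside the text run labelled as text, which is the kind of smoothing a plain per-row threshold would not provide.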