Text versus non-text distinction in online handwritten documents

  • Authors:
  • Emanuel Indermühle; Horst Bunke; Faisal Shafait; Thomas Breuel

  • Affiliations:
  • University of Bern, Bern, Switzerland; University of Bern, Bern, Switzerland; German Research Center for AI (DFKI), Kaiserslautern, Germany; German Research Center for AI (DFKI), Kaiserslautern, Germany

  • Venue:
  • Proceedings of the 2010 ACM Symposium on Applied Computing
  • Year:
  • 2010

Abstract

This paper explores how well text can be distinguished from non-text in online handwritten documents using only offline information. Two systems are introduced. The first system begins by segmenting the document into zones. For this purpose, four methods originally developed for machine-printed documents are compared: x-y cut, morphological closing, Voronoi segmentation, and whitespace analysis. A state-of-the-art classifier then distinguishes between text and non-text zones. The second system follows a bottom-up approach that classifies connected components. Experiments are performed on a new dataset of online handwritten documents containing different content types in arbitrary arrangements. The best system assigns 94.3% of the pixels to the correct class.
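
To make the first of the four segmentation methods concrete, here is a minimal sketch of a recursive x-y cut on a binary image. It is not the implementation evaluated in the paper. The function name `xy_cut`, the `min_gap` parameter, and the rule of always splitting at the widest empty gap are assumptions chosen for illustration; the abstract does not state which variant or parameters were used.

```python
def xy_cut(img, min_gap=2):
    """Recursively split a binary image into rectangular zones.

    img is a list of rows, each a list of 0 (background) or 1 (ink).
    Returns boxes as (top, left, bottom, right), bottom/right exclusive.
    min_gap (hypothetical parameter) is the smallest empty run, in
    pixels, that is accepted as a cut.
    """
    zones = []

    def widest_gap(profile):
        # Longest run of zero entries in a projection profile, as
        # (start, end) with end exclusive, or None if there is no run.
        best, start = None, None
        for i, value in enumerate(profile + [1]):  # sentinel ends a trailing run
            if value == 0 and start is None:
                start = i
            elif value != 0 and start is not None:
                if best is None or i - start > best[1] - best[0]:
                    best = (start, i)
                start = None
        return best

    def recurse(top, left, bottom, right):
        # Projection profiles: ink count per row and per column.
        rows = [sum(img[r][left:right]) for r in range(top, bottom)]
        cols = [sum(img[r][c] for r in range(top, bottom))
                for c in range(left, right)]
        if not any(rows):
            return  # region contains no ink

        # Shrink the region to the bounding box of its ink.
        r0 = next(i for i, v in enumerate(rows) if v)
        r1 = len(rows) - next(i for i, v in enumerate(reversed(rows)) if v)
        c0 = next(i for i, v in enumerate(cols) if v)
        c1 = len(cols) - next(i for i, v in enumerate(reversed(cols)) if v)
        top, bottom = top + r0, top + r1
        left, right = left + c0, left + c1
        rows, cols = rows[r0:r1], cols[c0:c1]

        # Collect the widest acceptable gap in each direction.
        candidates = []
        for axis, profile in (("row", rows), ("col", cols)):
            gap = widest_gap(profile)
            if gap is not None and gap[1] - gap[0] >= min_gap:
                candidates.append((gap[1] - gap[0], axis, gap))

        if not candidates:
            zones.append((top, left, bottom, right))  # leaf zone
            return

        # Cut at the widest gap and recurse into both halves.
        _, axis, (g0, g1) = max(candidates)
        if axis == "row":
            recurse(top, left, top + g0, right)
            recurse(top + g1, left, bottom, right)
        else:
            recurse(top, left, bottom, left + g0)
            recurse(top, left + g1, bottom, right)

    if img and img[0]:
        recurse(0, 0, len(img), len(img[0]))
    return zones


# Toy page: two blocks side by side above one full-width block.
page = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
zones = xy_cut(page, min_gap=2)
```

On the toy page, the first cut is horizontal, between the top blocks and the bottom block, and the second is vertical, separating the two top blocks. In a system like the first one described above, each resulting zone would then be handed to a classifier that labels it as text or non-text.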