An Overview of the Tesseract OCR Engine

  • Authors: R. Smith
  • Affiliations: Google Inc.
  • Venue: ICDAR '07 Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 02
  • Year: 2007

Abstract

The Tesseract OCR engine, which appeared as the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy [1], is described in a comprehensive overview. Emphasis is placed on aspects that are novel or at least unusual in an OCR engine, in particular the line finding, the features/classification methods, and the adaptive classifier.