Improving handwritten keyword spotting with self-training

  • Authors:
  • Volkmar Frinken;Andreas Fischer;Horst Bunke

  • Affiliations:
  • Institute for Computer Science and Applied Mathematics, Bern, Switzerland;Institute for Computer Science and Applied Mathematics, Bern, Switzerland;Institute for Computer Science and Applied Mathematics, Bern, Switzerland

  • Venue:
  • Proceedings of the 2011 ACM Symposium on Applied Computing
  • Year:
  • 2011

Abstract

Keyword spotting is the task of retrieving all instances of a given keyword in a set of documents. In this paper we consider the problem of keyword spotting in handwritten text, which is difficult because of the great variety of writing styles. Recently, learning-based keyword spotting systems have been shown to outperform traditional approaches, at the cost of requiring large amounts of training data. The training data need to be labeled manually, which is tedious and time-consuming. We propose to exploit unlabeled data via semi-supervised learning to reduce the need for labeled data when training a keyword spotting system. We demonstrate, on historical as well as modern handwritten text, that the performance of a learning-based keyword spotting system can be dramatically improved using this approach.
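The abstract does not describe the system itself, so the following is only a minimal sketch of the generic self-training idea named in the title: train on the labeled data, label the unlabeled samples the model is confident about, add them to the training set, and retrain. Everything in it is an assumption made for illustration, not the authors' method: the toy one-dimensional nearest-centroid classifier, the function names `train`, `predict` and `self_train`, the margin-based confidence, and the `threshold` and `max_rounds` parameters.

```python
from statistics import mean


def train(labeled):
    """Fit a toy 1-D nearest-centroid classifier: one mean value per class."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict(model, x):
    """Return (label, confidence) for x; needs at least two classes.

    Confidence is the relative margin between the two nearest centroids,
    a value in [0, 1] (hypothetical choice for this sketch).
    """
    dists = sorted((abs(x - c), y) for y, c in model.items())
    (d0, y0), (d1, _) = dists[0], dists[1]
    conf = 1.0 if d0 + d1 == 0 else (d1 - d0) / (d1 + d0)
    return y0, conf


def self_train(labeled, unlabeled, threshold=0.5, max_rounds=10):
    """Generic self-training loop over a pool of unlabeled samples."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            y, conf = predict(model, x)
            (confident if conf >= threshold else rest).append((x, y))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)        # adopt the confident pseudo-labels
        pool = [x for x, _ in rest]      # keep the uncertain samples unlabeled
        model = train(labeled)           # retrain on the enlarged set
    return model, pool


# Two labeled samples and five unlabeled ones (made-up numbers).
seed = [(0.0, "non-keyword"), (10.0, "keyword")]
model, remaining = self_train(seed, [1.0, 9.0, 2.0, 8.0, 5.0])
```

In this toy run the ambiguous sample at 5.0 never reaches the confidence threshold and stays in `remaining`. The threshold is the main design choice in such a loop: a low value adds more pseudo-labeled data but risks reinforcing the model's own errors.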