A holistic methodology for keyword search in historical typewritten documents

  • Authors:
  • Basilis Gatos; Thomas Konidaris; Ioannis Pratikakis; Stavros J. Perantonis

  • Affiliations:
  • Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center "Demokritos", Athens, Greece (all authors)

  • Venue:
  • SETN'06: Proceedings of the 4th Hellenic Conference on Advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

In this paper, we propose a novel holistic methodology for keyword search in historical typewritten documents that combines synthetic data with user feedback. The holistic approach treats each word as a single entity and recognizes the whole word rather than its individual characters. Our aim is to search for keywords typed by the user in a large collection of digitized historical typewritten documents. The proposed method is based on: (i) creation of synthetic word images; (ii) word segmentation using dynamic parameters; (iii) efficient hybrid feature extraction for each word image; and (iv) a retrieval procedure that is optimized by user feedback. Experimental results demonstrate the efficiency of the proposed approach.
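
The retrieval loop in steps (iii) and (iv) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes word images are already segmented and binarized (lists of 0/1 rows), it substitutes a plain zoning density feature for the paper's hybrid features, and it models user feedback as a second ranking pass seeded with the matches the user confirmed. The names `zone_features`, `rank`, `search_with_feedback`, and `confirm` are hypothetical.

```python
from math import sqrt


def zone_features(img, rows=2, cols=4):
    """Ink density in each cell of a rows x cols grid over a binary word image.

    Assumption: `img` is a non-empty list of equal-length rows of 0/1 values.
    This zoning feature is a stand-in for the paper's hybrid features.
    """
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            ink = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = max(1, (y1 - y0) * (x1 - x0))  # guard against empty cells
            feats.append(ink / area)
    return feats


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query_feats, index):
    """Sort word ids by their smallest distance to any query vector.

    `index` maps a word id to the feature vector of that document word.
    """
    return sorted(
        index, key=lambda k: min(distance(q, index[k]) for q in query_feats)
    )


def search_with_feedback(synthetic_img, index, confirm, top_k=3):
    """Two-pass search: synthetic query first, then confirmed real words.

    `confirm` is a hypothetical callback standing in for the user; it
    returns True for word ids the user marks as correct matches.
    """
    first_pass = rank([zone_features(synthetic_img)], index)
    accepted = [k for k in first_pass[:top_k] if confirm(k)]
    if not accepted:
        return first_pass  # no feedback given, keep the initial ranking
    # Re-query with the confirmed document words instead of the synthetic one.
    return rank([index[k] for k in accepted], index)
```

The intuition behind the second pass is that confirmed matches come from the documents themselves, so they share the typeface and degradation of the collection, which a synthetic query may not capture. How the paper actually uses the feedback is not stated in the abstract.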