Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH

  • Authors:
  • Kazutaka Takeda; Koichi Kise; Masakazu Iwamura

  • Venue:
  • ICDAR '11 Proceedings of the 2011 International Conference on Document Analysis and Recognition
  • Year:
  • 2011

Abstract

This paper presents a real-time document image retrieval method for large-scale databases based on Locally Likely Arrangement Hashing (LLAH). In general, as a database is scaled up, memory requirements grow and retrieval accuracy drops because the features lack sufficient discrimination power. To solve these problems, we propose three improvements: reducing memory by sampling feature points, improving discrimination power by increasing the number of feature dimensions, and stabilizing features by reducing redundancy. Experimental results confirm that the proposed method reduces memory usage by 50% and achieves 99.4% accuracy with a processing time of 38 ms on a database of 10 million pages.
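
As a rough illustration of the first two ideas in the abstract (storing only a sample of feature points, and using more feature dimensions for discrimination), the following Python toy indexes point sets in a hash table and retrieves by voting. It is a sketch under stated assumptions, not the paper's method: the feature used here (quantized ratios of consecutive nearest-neighbour distances) is a simple stand-in that is invariant only to rotation, scale and translation, whereas LLAH itself is built on geometric invariants of local point arrangements. All names (`point_feature`, `build_index`, `retrieve`, `index_size`, `keep_ratio`, `n_neighbors`, `levels`) and the synthetic data are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def point_feature(points, i, n_neighbors=6, levels=16):
    """Stand-in feature (NOT the LLAH invariant): quantized ratios of
    consecutive nearest-neighbour distances around points[i]."""
    px, py = points[i]
    dists = sorted(
        math.hypot(x - px, y - py)
        for j, (x, y) in enumerate(points) if j != i
    )[:n_neighbors]
    # Each ratio lies in (0, 1]; more neighbours give more feature dimensions.
    return tuple(
        min(int(levels * a / b), levels - 1)
        for a, b in zip(dists, dists[1:])
    )


def build_index(docs, keep_ratio=1.0, seed=0):
    """Hash table from feature to document IDs, storing a sample of points."""
    rng = random.Random(seed)
    index = defaultdict(list)
    for doc_id, points in docs.items():
        for i in range(len(points)):
            # Neighbours come from all points, but only a sample is stored.
            if rng.random() < keep_ratio:
                index[point_feature(points, i)].append(doc_id)
    return index


def retrieve(index, query_points):
    """Each query point votes once for every document sharing its feature."""
    votes = Counter()
    for i in range(len(query_points)):
        feature = point_feature(query_points, i)
        for doc_id in set(index.get(feature, ())):
            votes[doc_id] += 1
    return votes.most_common(1)[0][0] if votes else None


def index_size(index):
    """Number of stored entries, a proxy for memory use."""
    return sum(len(ids) for ids in index.values())


# Synthetic "pages": 10 documents with 80 random feature points each.
rng = random.Random(42)
docs = {
    d: [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(80)]
    for d in range(10)
}

full = build_index(docs)                  # every point stored
half = build_index(docs, keep_ratio=0.5)  # roughly half of the points stored

# Query: document 3 after a rotation, scaling and translation.
c, s = math.cos(0.3), math.sin(0.3)
query = [(1.7 * (c * x - s * y) + 30, 1.7 * (s * x + c * y) - 12)
         for x, y in docs[3]]

print("entries (full, sampled):", index_size(full), index_size(half))
print("retrieved from sampled index:", retrieve(half, query))
```

Lowering `keep_ratio` shrinks the table roughly in proportion, while raising `n_neighbors` lengthens the feature tuple, which is one way to keep unrelated documents from sharing buckets as the number of indexed pages grows.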