Content-Based Indexing and Retrieval Method of Chinese Document Images

Authors:
Yaodong He; Zao Jiang; Bing Liu; Hong Zhao
Venue:
ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
Year:
1999

Abstract

In Chinese information retrieval, indexing a text document is straightforward: the text only needs to be segmented into phrases. When the document is a Chinese document image (a non-ASCII file), the usual approach is to first convert the image into a text file with Chinese optical character recognition (OCR) and then index the result with an information retrieval algorithm. However, OCR is time-consuming, which reduces retrieval efficiency. This paper proposes an indexing method based on a stroke density code. The document image is first segmented into individual Chinese character images; the stroke density of each character image is then calculated; finally, the stroke density code of the character image is derived. The indexing method is fast and robust to noise. The paper also presents a retrieval method for Chinese document images built on this index, with particular attention to indexing and retrieval for duplicate detection. The index method is validated through its application to keyword spotting and duplicate detection.
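The abstract does not specify how the stroke density code is computed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each segmented character is a binary image (1 = stroke pixel), divides it into a `grid` x `grid` layout of cells, and quantizes the ink fraction of each cell into `levels` steps. The names `stroke_density_code` and `build_index`, the grid size, and the number of levels are all hypothetical choices made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of 0/1 pixels, 1 = stroke (ink)


def stroke_density_code(char_img: Image, grid: int = 2, levels: int = 4) -> str:
    """Hypothetical stroke density code for one segmented character image.

    grid and levels are assumptions for illustration; levels must be <= 10
    so that each cell maps to a single digit.
    """
    height, width = len(char_img), len(char_img[0])
    digits = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            area = (y1 - y0) * (x1 - x0)
            ink = sum(char_img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            density = ink / area if area else 0.0
            # Quantize the ink fraction; a fully inked cell falls in the top level.
            digits.append(str(min(int(density * levels), levels - 1)))
    return "".join(digits)


def build_index(docs: Dict[str, List[Image]]) -> Dict[str, List[Tuple[str, int]]]:
    """Inverted index: code -> list of (document id, character position)."""
    index: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for doc_id, chars in docs.items():
        for pos, img in enumerate(chars):
            index[stroke_density_code(img)].append((doc_id, pos))
    return index


sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
blank = [[0] * 4 for _ in range(4)]
index = build_index({"doc-a": [sample, blank]})
```

Under these assumptions, a keyword query would be converted to a sequence of codes in the same way and matched against consecutive positions in the index, without running OCR. Because each cell is coarsely quantized, small amounts of pixel noise often leave the code unchanged, which is one plausible reading of the robustness claim.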