Document Image Retrieval Based on Density Distribution Feature and Key Block Feature

Authors:
Hong Liu;Suoqian Feng;Hongbin Zha;Xueping Liu
Affiliations:
Peking University, China;Peking University, China;Peking University, China;Ricoh Co., Japan
Venue:
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
Year:
2005

Citing 4

Imaged Document Text Retrieval Without OCR
IEEE Transactions on Pattern Analysis and Machine Intelligence

A Robust Skew Detection Algorithm for Grayscale Document Image
ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition

Content-Based Indexing and Retrieval Method of Chinese Document Images
ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition

Document Filtering for Fast Approximate String Matching of Erroneous Text
ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition

Cited 4

Exploring digital libraries with document image retrieval
ECDL'07 Proceedings of the 11th European conference on Research and Advanced Technology for Digital Libraries

Amharic document image retrieval using morphological coding
Proceedings of the International Conference on Management of Emergent Digital EcoSystems

Near-duplicate document image matching: A graphical perspective
Pattern Recognition

A new thresholding algorithm for document images based on the perception of objects by distance
Integrated Computer-Aided Engineering



Abstract

Document image retrieval is an important component of many document image processing systems, such as paperless office systems and digital libraries. Its task is to help users find the most similar document images in a document image database. To support retrieval across document images that differ in resolution and format and that contain mixed characters from multiple languages, this paper proposes a new retrieval method based on density distribution features and key block features of document images. First, the density distribution and key block features of a document image are defined and extracted based on the document's print-core. Second, candidate document images are obtained using the density distribution features. Third, to improve the reliability of the retrieval results, a confirmation procedure using key block features is applied to those candidates. Experimental results on a large-scale database of 10,385 document images show that the proposed method is efficient and robust in retrieving different kinds of document images in real time.
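
The two-stage flow described in the abstract (coarse candidate selection, then confirmation) can be illustrated with a minimal Python sketch. The abstract does not give the feature definitions, so everything here is an assumption made for illustration: the print-core is taken to be the bounding box of all ink pixels, the density distribution is the ink ratio in each cell of a grid over that box, and the "key block" is stood in for by the index of the densest cell in a finer grid. The names print_core, density_features, key_block, and retrieve are hypothetical and are not taken from the paper.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def print_core(img: Image) -> Image:
    """Crop to the bounding box of all ink pixels (assumed meaning of 'print-core')."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    if not rows:
        return img  # blank page: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def density_features(img: Image, grid: int = 4) -> List[float]:
    """Ink ratio per cell of a grid x grid partition of the print-core.

    Ratios over a relative grid do not depend on the absolute pixel size,
    which is one simple way to compare images of different resolutions.
    """
    core = print_core(img)
    h, w = len(core), len(core[0])
    feats = []
    for gy in range(grid):
        y0, y1 = gy * h // grid, (gy + 1) * h // grid
        for gx in range(grid):
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            area = (y1 - y0) * (x1 - x0)
            ink = sum(sum(row[x0:x1]) for row in core[y0:y1])
            feats.append(ink / area if area else 0.0)
    return feats


def key_block(img: Image, grid: int = 8) -> int:
    """Hypothetical stand-in for a key block feature: index of the densest fine cell."""
    feats = density_features(img, grid)
    return max(range(len(feats)), key=feats.__getitem__)


def l1_distance(a: List[float], b: List[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(query: Image, database: List[Image], top_k: int = 3) -> List[int]:
    """Stage 1: rank by density-feature distance. Stage 2: confirm by key block."""
    q_feats = density_features(query)
    ranked = sorted(
        range(len(database)),
        key=lambda i: l1_distance(q_feats, density_features(database[i])),
    )
    candidates = ranked[:top_k]
    q_key = key_block(query)
    # Keep only candidates whose key block agrees with the query's.
    return [i for i in candidates if key_block(database[i]) == q_key]
```

In this sketch the cheap grid features narrow the database to a few candidates, and the stricter check runs only on those, which mirrors the candidate-then-confirm structure of the abstract. A real system would precompute and index the features rather than recompute them for every query.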