A word shape coding method for camera-based document images

Authors:
Linlin Li;Chew Lim Tan
Affiliations:
National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Citing 5
Cited 0

The indexing and retrieval of document images: a survey

Computer Vision and Image Understanding - Special issue on document image understanding and retrieval
Imaged Document Text Retrieval Without OCR

IEEE Transactions on Pattern Analysis and Machine Intelligence
The Document Spectrum for Page Layout Analysis

IEEE Transactions on Pattern Analysis and Machine Intelligence
Information Retrieval in Document Image Databases

IEEE Transactions on Knowledge and Data Engineering
Affine invariant comparison of point-sets using convex hulls and hausdorff distances

Pattern Recognition

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper reports a word shape coding method to facilitate retrieval of camera-based document images without OCR. Due to perspective distortion, many reported word shape coding methods fail on camera-based images. In this paper, the problem is addressed by approximating the perspective transformation with an affine transformation, and employing an affine invariant, namely length ratio, to represent the connected components. Components in a document image are classified into a few clusters, each of which is assigned with a representative symbol. Retrieval are based on "words" comprising of symbols. The experiment results showed that the proposed method achieved an average retrieval precision of 93.43% and recall of 94.22%.