Image-Based Document Vectors for Text Retrieval

Venue: ICPR '00 Proceedings of the International Conference on Pattern Recognition - Volume 4
Year: 2000

Cited by (2):

Information Retrieval in Document Image Databases. IEEE Transactions on Knowledge and Data Engineering.
Feature string-based intelligent information retrieval from Tamil document images. International Journal of Computer Applications in Technology.

Abstract

We propose a method for constructing a vector that represents the content of a document image, in order to facilitate text retrieval. The method builds on the N-gram algorithm, which measures text similarity from the frequency of occurrence of n-character strings in electronic text. Instead of using ASCII values, the present study investigates the use of character images to obtain the document vector, and has found promising results for use in our news article retrieval project.
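
To make the underlying idea concrete, the following is a minimal sketch of the conventional text-based N-gram approach that the abstract takes as its starting point. It is not the authors' image-based method: it operates on ordinary character strings rather than character images. The function names `ngram_vector` and `cosine_similarity`, the default `n=3`, and the choice of cosine similarity as the comparison measure are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def ngram_vector(text, n=3):
    """Count how often each n-character string occurs in text.

    Hypothetical helper: the result is a sparse frequency vector
    keyed by n-gram. The default n=3 is an illustrative assumption.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Compare two n-gram frequency vectors.

    Cosine similarity is an assumed measure, chosen for illustration;
    the abstract does not say which similarity measure is used.
    """
    # Counter returns 0 for n-grams that are missing from b.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = ngram_vector("document image retrieval")
    article = ngram_vector("retrieval of document images")
    print(cosine_similarity(query, article))
```

In an image-based variant such as the one the abstract describes, the string keys would presumably be replaced by identifiers derived from character images, while the counting and comparison steps could stay the same.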