This paper presents a system that locates words in document image archives. The technique matches words directly in the document images, bypassing character recognition and using word images as queries. It first applies document image processing techniques to extract features that describe the word images. The features used for comparison capture the general shape of the query while ignoring details caused by noise or font differences. To demonstrate the effectiveness of the system, we ran it on a collection of noisy documents and compared the results with those of a commercial optical character recognition (OCR) package.
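The abstract does not name the features the system uses, so the following is only a minimal sketch of the general idea of query-by-word-image matching without character recognition. It assumes binarized word images given as lists of rows of 0/1 values (1 = ink), and it substitutes two common coarse shape descriptors, column ink density and upper profile, for the paper's unspecified features. The names `resample`, `shape_features`, `distance`, and `rank_words` are hypothetical and chosen for illustration.

```python
from math import sqrt


def resample(values, n):
    """Linearly resample a sequence to n points (n >= 2) so that
    word images of different widths yield vectors of equal length."""
    if len(values) == 1:
        return [float(values[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def shape_features(img, n=16):
    """Coarse shape descriptor of a binary word image (assumed features,
    not the paper's): column ink density followed by upper profile."""
    height, width = len(img), len(img[0])
    # Fraction of ink pixels in each column.
    density = [sum(img[r][c] for r in range(height)) / height
               for c in range(width)]
    # Normalized row of the first ink pixel per column (1.0 if the column is empty).
    upper = []
    for c in range(width):
        first = next((r for r in range(height) if img[r][c]), height)
        upper.append(first / height)
    return resample(density, n) + resample(upper, n)


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_words(query, candidates, n=16):
    """Return candidate names ordered from most to least similar to the query."""
    q = shape_features(query, n)
    scored = [(distance(q, shape_features(img, n)), name)
              for name, img in candidates.items()]
    return [name for _, name in sorted(scored)]
```

Because the descriptors summarize each column rather than individual pixels, a candidate that differs from the query by a few noisy pixels stays close in feature space, while a word with a different outline moves far away. A real system would also need word segmentation and more robust normalization, which this sketch leaves out.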