This paper proposes an approach to searching for user-specified words or phrases in Chinese document images without requiring layout analysis. Bounding boxes of Chinese character images are first determined using connected component analysis. Next, a suitable character from the user-specified word/phrase is chosen as the initial character, and the document is searched for a matching candidate. Once a matching candidate is found, its adjacent characters in the horizontal and vertical directions are compared with the corresponding characters of the user-specified word/phrase, subject to constraints on positional relation and size similarity. Character matching is done in two stages: a coarse match based on stroke density features, followed by a fine match using a proposed weighted Hausdorff distance (WHD). Experimental results show that the proposed method can effectively find the user-specified Chinese word/phrase in horizontal or vertical text lines of document images.
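The two-stage character matching can be illustrated with a minimal sketch. The abstract gives neither the exact stroke density features nor the weighting used in the WHD, so everything below is an assumption made for illustration: stroke density is taken as the fraction of foreground pixels in each cell of a small grid, the WHD is approximated by a weighted mean of nearest-neighbour distances (uniform weights by default), both images are assumed to be binary and normalized to the same size, and the function names and thresholds (`tol`, `max_whd`) are hypothetical.

```python
from math import hypot

def stroke_density(img, grid=2):
    """Fraction of foreground pixels in each cell of a grid x grid partition.

    Assumed stand-in for the paper's stroke density features.
    """
    h, w = len(img), len(img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            cell = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats

def coarse_match(a, b, grid=2, tol=0.25):
    """Stage 1: accept only if every cell density differs by at most tol."""
    fa, fb = stroke_density(a, grid), stroke_density(b, grid)
    return max(abs(p - q) for p, q in zip(fa, fb)) <= tol

def foreground(img):
    """Coordinates (row, column) of all foreground pixels."""
    return [(y, x) for y, row in enumerate(img) for x, v in enumerate(row) if v]

def directed_whd(pa, pb, weights=None):
    """Weighted mean of nearest-neighbour distances from points pa to pb.

    The weights are an assumption; uniform weights are used when none are given.
    """
    if not pa or not pb:
        return float("inf")
    if weights is None:
        weights = [1.0] * len(pa)
    total = sum(
        w * min(hypot(ay - by, ax - bx) for by, bx in pb)
        for w, (ay, ax) in zip(weights, pa)
    )
    return total / sum(weights)

def whd(a, b):
    """Stage 2: symmetric distance, the larger of the two directed values."""
    pa, pb = foreground(a), foreground(b)
    return max(directed_whd(pa, pb), directed_whd(pb, pa))

def match(a, b, grid=2, tol=0.25, max_whd=1.0):
    """Run the cheap coarse test first; compute the WHD only if it passes."""
    return coarse_match(a, b, grid, tol) and whd(a, b) <= max_whd

# Two toy 4x4 binary "character" images (1 = stroke pixel).
A = [[1, 1, 1, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 1]]
C = [[0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0],
     [1, 1, 1, 1]]

print(match(A, A), match(A, C))
```

Running the coarse test first means the more expensive point-set distance is only computed for candidates whose overall stroke distribution is already close to the query character.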