This paper presents a method for automatically extracting keywords from historical document images. The proposed method is language independent because it is purely appearance based: it requires neither lexical information nor any statistical language model. Moreover, since it does not need word segmentation, it can be applied to Eastern languages, which do not put clear spaces between words. The first half of the paper describes an algorithm that retrieves document image regions whose appearance is similar to that of a given query image. The algorithm was evaluated in terms of recall and precision and achieved an average precision of over 80–90%. The second half of the paper describes the keyword extraction method, which works even if no query word is explicitly specified. Because efficient pruning techniques reduce the computational cost, the system successfully extracted keywords from relatively large documents.
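
As a point of reference for the evaluation figure quoted above, the following is a minimal sketch of how average precision can be computed for a single query from a ranked list of retrieved regions. It assumes the common non-interpolated definition; the abstract does not state which variant the paper uses, and the function name `average_precision`, its parameters, and the sample ranking are hypothetical illustrations rather than anything taken from the paper.

```python
def average_precision(ranked_relevance, total_relevant):
    """Non-interpolated average precision for one ranked result list.

    ranked_relevance: list of booleans, best match first; True means the
        retrieved image region really contains the query word.
    total_relevant: number of true occurrences of the query word in the
        whole collection (assumed known from ground truth).
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each correct retrieval.
            precision_sum += hits / rank
    # Occurrences that were never retrieved contribute zero precision.
    return precision_sum / total_relevant


# Hypothetical ranking for one query: 3 of 4 true occurrences were found.
ranking = [True, True, False, True, False]
ap = average_precision(ranking, total_relevant=4)
print(ap)
```

Averaging this value over all queries gives a single summary score; under this definition, a missed occurrence lowers the score just as a false match ranked near the top does.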