In this paper, we present an approach to retrieving words from graphical document images. In graphical documents, word indexing is challenging because characters appear in multiple orientations within an unstructured layout. The proposed approach uses the recognition results of individual components to form character pairs with neighboring components. An indexing scheme is designed to store the spatial description of the components and to access them efficiently. Given a query word as text (ASCII/Unicode format), the character pairs present in it are searched for in the document. The retrieved character pairs are then linked sequentially to form character strings. Dynamic programming is applied to find the different instances of the query word, with a string edit distance to the query word serving as the objective function. Multi-scale and multi-oriented character components are recognized using a Support Vector Machine (SVM) classifier. To handle multi-oriented character strings, the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient at locating a query word in multi-oriented text in graphical documents.
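The pair-indexing and linking steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each component already carries a recognized label and a centroid (the SVM recognition step is omitted), it uses a simple distance threshold to define "neighboring" components, and it links pairs with a depth-first search before scoring each candidate string with a dynamic-programming edit distance. The function names and the `max_dist` and `max_edits` parameters are hypothetical.

```python
from collections import defaultdict
from math import hypot


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def build_pair_index(components, max_dist):
    """Map each ordered label pair (e.g. "MA") to the component index pairs
    that realise it. components is a list of (label, x, y) tuples; two
    components are treated as neighbors if their centroids lie within
    max_dist (an assumed neighborhood rule)."""
    index = defaultdict(list)
    for i, (li, xi, yi) in enumerate(components):
        for j, (lj, xj, yj) in enumerate(components):
            if i != j and hypot(xi - xj, yi - yj) <= max_dist:
                index[li + lj].append((i, j))
    return index


def find_word(query, components, index, max_edits=1):
    """Retrieve the query's character pairs, link them into chains, and keep
    chains whose text is within max_edits of the query."""
    bigrams = {query[k:k + 2] for k in range(len(query) - 1)}
    nxt = defaultdict(list)  # component index -> possible successor indices
    for bg in bigrams:
        for i, j in index.get(bg, []):
            nxt[i].append(j)

    results = []

    def extend(chain):
        extended = False
        for j in nxt.get(chain[-1], []):
            if j not in chain:  # never reuse a component within one chain
                extended = True
                extend(chain + [j])
        if not extended:  # maximal chain: score it against the query
            text = "".join(components[c][0] for c in chain)
            dist = edit_distance(text, query)
            if dist <= max_edits:
                results.append((dist, chain))

    for start in list(nxt):
        extend([start])
    return sorted(results)


if __name__ == "__main__":
    # Three characters laid out along a diagonal, plus one distant component.
    comps = [("M", 0, 0), ("A", 7, 7), ("P", 14, 14), ("X", 50, 50)]
    idx = build_pair_index(comps, max_dist=12)
    print(find_word("MAP", comps, idx, max_edits=0))
```

Because the pairs are keyed by label and linked through spatial neighborhood only, the chain is found regardless of the direction in which the text runs. Exhaustive chain enumeration is chosen here for brevity and would need pruning on dense documents.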