Summarization of JBIG2 Compressed Indian Language Textual Images
ICPR '06 Proceedings of the 18th International Conference on Pattern Recognition - Volume 03
This paper discusses automatic summarization of JBIG2-coded textual images. The compressed images are only partially decompressed to compute the relevant features, and the feature extraction method does not use any character recognition module. Summary sentences are ranked. The experiments use documents in Indic scripts, for which no efficient OCR systems are available. The script-independent nature of the approach is demonstrated on two of the most popular Indic scripts. A sentence selection efficiency of about 61% is achieved when judged against human-made summaries. A nonparametric (distribution-free) rank statistic gives a correlation coefficient of 0.33 as a measure of the (minimum) strength of association between the sentence rankings produced by machine and by human.
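The two evaluation figures quoted above can be illustrated with a minimal sketch. The abstract does not name the rank statistic or define selection efficiency, so the code assumes Spearman's rho for the former and "fraction of human-selected sentences that the machine also selected" for the latter. The function names and the sample data are hypothetical and are not taken from the paper.

```python
def ranks(scores):
    """Return 1-based ranks (1 = highest score); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def spearman(machine_scores, human_scores):
    """Spearman's rho (assumed statistic): Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input has all scores equal.
    """
    rx, ry = ranks(machine_scores), ranks(human_scores)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def selection_efficiency(machine_selected, human_selected):
    """Assumed definition: share of human-selected sentences the machine also selected."""
    human = set(human_selected)
    return len(human & set(machine_selected)) / len(human)


# Hypothetical sentence scores and selections, for illustration only.
machine = [0.9, 0.4, 0.7, 0.1, 0.5]
human = [0.8, 0.6, 0.3, 0.2, 0.7]
rho = spearman(machine, human)
efficiency = selection_efficiency([0, 2, 4], [0, 1, 4])
```

A value of rho near 1 would mean the machine and the human order the sentences almost identically, so the reported 0.33 indicates a positive but weak agreement in ranking even though a majority of the selected sentences coincide.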