Visual signature based identification of Low-resolution document images

Authors:
Ardhendu Behera;Denis Lalanne;Rolf Ingold
Affiliations:
Université de Fribourg, Fribourg;Université de Fribourg, Fribourg;Université de Fribourg, Fribourg
Venue:
Proceedings of the 2004 ACM symposium on Document engineering
Year:
2004

Abstract

In this paper, we present (a) a method for identifying documents captured with low-resolution devices such as webcams, digital cameras, or mobile phones and (b) a technique for extracting their textual content without performing OCR. The first method associates a hierarchically structured visual signature with the low-resolution document image and matches it against the visual signatures of the original high-resolution document images, which are stored as PDF files in a repository. The matching algorithm follows the signature hierarchy, which speeds up the search by guiding it toward the most promising regions of the solution space. In a second step, the content of the original PDF document is extracted, structured, and matched with its corresponding high-resolution visual signature. Finally, the matched content is attached to the visual signature of the low-resolution document image, which enriches the document's content and indexing. We present both the identification and extraction methods and evaluate them on a variety of documents, resolutions, and lighting conditions, using different capture devices.
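
The hierarchy-guided matching described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, the distance measure, or the pruning rule, so everything below is an assumption made for illustration: the `Signature` class, the `identify` function, the Euclidean distance, and the `tolerance` threshold are hypothetical and are not the authors' implementation. The sketch only shows the general idea of comparing coarse levels first and discarding repository documents that already disagree, so that finer levels are compared against fewer candidates.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Signature:
    """Hypothetical hierarchical signature: coarse features first, finer ones later."""
    doc_id: str
    levels: list[tuple[float, ...]]


def identify(query, repository, tolerance=0.25):
    """Return the doc_id of the best match, or None if every candidate is pruned.

    Assumption: all signatures have the same number of levels and the same
    feature dimension at each level. The tolerance value is arbitrary.
    """
    candidates = list(repository)
    ranked = []
    for depth, query_features in enumerate(query.levels):
        # Score only the candidates that survived the coarser levels.
        scored = [(dist(query_features, c.levels[depth]), c) for c in candidates]
        ranked = sorted(
            ((d, c) for d, c in scored if d <= tolerance),
            key=lambda pair: pair[0],
        )
        if not ranked:
            return None  # nothing in the repository is close enough at this level
        candidates = [c for _, c in ranked]
    return ranked[0][1].doc_id if ranked else None


# Toy repository with made-up feature values (not data from the paper).
repository = [
    Signature("a.pdf", [(0.5, 0.5), (0.1, 0.9, 0.3)]),
    Signature("b.pdf", [(0.5, 0.4), (0.8, 0.2, 0.3)]),
    Signature("c.pdf", [(0.9, 0.1), (0.1, 0.9, 0.3)]),
]
query = Signature("webcam-capture", [(0.52, 0.48), (0.15, 0.85, 0.3)])
print(identify(query, repository))
```

In this toy run, `c.pdf` is discarded at the coarse level even though its fine-level features resemble the query, which is the kind of early pruning that a hierarchy-guided search relies on. A fixed threshold is only one possible pruning rule; keeping the best few candidates per level would be an equally plausible choice.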