This paper describes features and methods for comparing and classifying document images at the spatial layout level. The methods are useful for document retrieval based on visual similarity, and as fast algorithms for initial document type classification without OCR. A novel feature set called interval encoding is introduced to capture elements of spatial layout. This feature set encodes region layout information in fixed-length vectors, which can be used for fast page layout comparison. The paper describes experiments and results on rank-ordering a set of document pages by their layout similarity to a test document. We also demonstrate the usefulness of features derived from interval encoding in a hidden Markov model-based page layout classification system that is trainable and extensible.
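The idea of comparing layouts through fixed-length vectors can be illustrated with a small sketch. The abstract does not define interval encoding, so the code below uses a stand-in that is purely an assumption: each page is a coarse grid of bins (1 for text, 0 for whitespace), each bin is replaced by its position within the current horizontal run of text, and pages are ranked by L1 distance between the resulting vectors. The names `encode_row`, `encode_page`, `l1_distance`, and `rank_by_layout` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence

Grid = Sequence[Sequence[int]]  # 1 = text bin, 0 = whitespace bin (assumed representation)


def encode_row(row: Sequence[int]) -> List[int]:
    """Replace each bin by its 1-based position inside the current text run.

    Whitespace bins map to 0. This is an illustrative stand-in, not the
    paper's definition of interval encoding.
    """
    encoded = []
    run = 0
    for is_text in row:
        run = run + 1 if is_text else 0
        encoded.append(run)
    return encoded


def encode_page(grid: Grid) -> List[int]:
    """Concatenate the encoded rows into one fixed-length vector."""
    vector: List[int] = []
    for row in grid:
        vector.extend(encode_row(row))
    return vector


def l1_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("layout vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_layout(query: Grid, pages: Dict[str, Grid]) -> List[str]:
    """Return page names ordered from most to least similar to the query."""
    q = encode_page(query)
    return sorted(pages, key=lambda name: l1_distance(q, encode_page(pages[name])))
```

Because every page on the same grid yields a vector of the same length, comparing two pages costs one linear pass, which is what makes this kind of representation attractive for rank-ordering a collection against a test page. A hidden Markov model classifier, as mentioned in the abstract, would consume the per-row encodings as an observation sequence; that step is not sketched here.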