This paper outlines the requirements and components for a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing. Several critical functions have been investigated and the technical approaches are discussed. The first is the segmentation and classification of digitized printed documents into regions of text and images. A nonlinear, run-length smoothing algorithm has been used for this purpose. By using the regular features of text lines, a linear adaptive classification scheme discriminates text regions from others. The second technique studied is an adaptive approach to the recognition of the hundreds of font styles and sizes that can occur on printed documents. A preclassifier is constructed during the input process and used to speed up a well-known pattern-matching method for clustering characters from an arbitrary print source into a small sample of prototypes. Experimental results are included.