Structural handwritten and machine print classification for sparse content and arbitrary oriented document fragments

Authors:
Sukalpa Chanda;Katrin Franke;Umapada Pal
Affiliations:
Gjøvik University College, Gjøvik, Norway;Gjøvik University College, Gjøvik, Norway;Indian Statistical Institute, Kolkata, India
Venue:
Proceedings of the 2010 ACM Symposium on Applied Computing
Year:
2010

Abstract

Discriminating handwritten from printed text is a challenging task when the text can appear at an arbitrary orientation. The task becomes harder still when the text content is inherently sparse, e.g. in torn document pieces. We propose a system for discriminating handwritten and printed text in the context of sparse data and arbitrary orientation. A chain-code feature is used with a Support Vector Machine (SVM) classifier for this purpose. Prior to feature extraction and classification, some preprocessing steps (such as region growing and angle estimation using Principal Component Analysis) are performed to resolve the arbitrary orientation issue. We obtained a promising accuracy of 96.90%, even when the document consists of sparse data with arbitrary orientation.
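
The abstract gives no implementation details for the angle estimation step, so the following is only a minimal sketch of how PCA can estimate the dominant orientation of a text fragment. It assumes the fragment is already available as a binary foreground mask (for example, one region produced by region growing). The function name `estimate_orientation` and the use of NumPy are illustrative assumptions, not the authors' code.

```python
import numpy as np


def estimate_orientation(mask):
    """Estimate the dominant axis of the foreground pixels in a binary mask.

    Returns an angle in degrees, folded to the range 0 to 180, measured in
    image coordinates (x to the right, y downwards). Hypothetical helper,
    not taken from the paper.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 2:
        raise ValueError("need at least two foreground pixels")

    # Stack pixel coordinates as a 2 x N matrix and centre them.
    coords = np.stack([xs, ys]).astype(float)
    coords -= coords.mean(axis=1, keepdims=True)

    # The eigenvector of the covariance matrix with the largest
    # eigenvalue is the first principal component, i.e. the direction
    # of greatest spread of the foreground pixels.
    cov = np.cov(coords)
    eigvals, eigvecs = np.linalg.eigh(cov)
    vx, vy = eigvecs[:, np.argmax(eigvals)]

    # An axis has no sign, so fold the angle modulo 180 degrees.
    return float(np.degrees(np.arctan2(vy, vx)) % 180.0)


if __name__ == "__main__":
    # Synthetic fragment: a diagonal stroke in a 20 x 20 image.
    demo = np.zeros((20, 20), dtype=bool)
    idx = np.arange(20)
    demo[idx, idx] = True
    print(estimate_orientation(demo))
```

Under these assumptions, rotating the fragment by the negative of the estimated angle would bring its dominant axis to the horizontal before features are extracted. How the original system resolves the remaining 180-degree ambiguity is not stated in the abstract.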