A Survey of Methods and Strategies in Character Segmentation

Authors:
Richard G. Casey;Eric Lecolinet
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
1996


Abstract

Character segmentation has long been a critical area of the OCR process. The higher recognition rates for isolated characters vs. those obtained for words and connected character strings well illustrate this fact. A good part of recent progress in reading unconstrained printed and written text may be ascribed to more insightful handling of segmentation. This paper provides a review of these advances. The aim is to provide an appreciation for the range of techniques that have been developed, rather than to simply list sources. Segmentation methods are listed under four main headings. What may be termed the "classical" approach consists of methods that partition the input image into subimages, which are then classified. The operation of attempting to decompose the image into classifiable units is called "dissection." The second class of methods avoids dissection, and segments the image either explicitly, by classification of prespecified windows, or implicitly by classification of subsets of spatial features collected from the image as a whole. The third strategy is a hybrid of the first two, employing dissection together with recombination rules to define potential segments, but using classification to select from the range of admissible segmentation possibilities offered by these subimages. Finally, holistic approaches that avoid segmentation by recognizing entire character strings as units are described.
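
As a rough illustration of what "dissection" can mean in practice, the sketch below cuts a binary text-line image at blank columns using a vertical projection profile. This is only one simple way to dissect an image and is an assumption made for illustration; the abstract does not name a specific algorithm. The function names `vertical_projection` and `dissect`, the `min_ink` parameter, and the toy image are all hypothetical.

```python
def vertical_projection(image):
    """Count ink pixels per column of a binary image (list of rows of 0/1)."""
    return [sum(column) for column in zip(*image)]


def dissect(image, min_ink=1):
    """Split the image at columns whose ink count falls below min_ink.

    Returns half-open (start, end) column intervals, one per candidate
    subimage. Illustrative only: touching characters are NOT separated.
    """
    profile = vertical_projection(image)
    segments = []
    start = None
    for x, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = x                      # entering an inked run
        elif ink < min_ink and start is not None:
            segments.append((start, x))    # leaving an inked run
            start = None
    if start is not None:                  # run reaches the right edge
        segments.append((start, len(profile)))
    return segments


# Hypothetical 3x8 binary image containing three separated blobs.
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 0, 0, 1],
]
segments = dissect(image)
print(segments)
```

Each returned interval would then be passed to a classifier as a candidate character. A cut of this kind returns touching or overlapping characters as a single segment, which is one reason the recognition-based, hybrid, and holistic strategies described in the abstract exist.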