Word Discrimination Based on Bigram Co-Occurrences

Authors:
Affiliations:
Venue:
ICDAR '01 Proceedings of the Sixth International Conference on Document Analysis and Recognition
Year:
2001

Citing 0
Cited 2

Handwriting Recognition Using Position Sensitive Letter N-Gram Matching

ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
CodeSlinger: An Interactive Biomedical Ontology Browser

AIME '09 Proceedings of the 12th Conference on Artificial Intelligence in Medicine: Artificial Intelligence in Medicine

Quantified Score

Hi-index	0.00

Visualization

Abstract

Abstract: Very few pairs of English words share exactly the same letter bigrams. This linguistic property can be exploited to bring lexical context into the classification stage of a word recognition system. The lexical n-gram matches between every word in a lexicon and a subset of reference words can be precomputed. If a match function can detect matching segments of at least n-gram length from the feature representation of words, then an unknown word can be recognized by determining the subset of reference words having an n-gram match at the feature level with the unknown word. We show that with a reasonable number of reference words, bigrams represent the best compromise between the recall ability of single letters and the precision of trigrams. Our simulations indicate that using a longer reference list can compensate errors in feature extraction. The algorithm is fast enough, even with a slow processor, for human-computer interaction.