Amharic Character Recognition using a Fast Signature Based Algorithm

Authors:
John Cowell;Fiaz Hussain
Affiliations:
-;-
Venue:
IV '03 Proceedings of the Seventh International Conference on Information Visualization
Year:
2003

Citing 0
Cited 4

Classifying Amharic webnews

Information Retrieval
Classifying Amharic news text using self-organizing maps

Semitic '05 Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages
Automatic diacritic restoration for resource-scarce languages

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Structural and syntactic techniques for recognition of ethiopic characters

SSPR'06/SPR'06 Proceedings of the 2006 joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition


Abstract

The Amharic language is the principal language of over 20 million people, mainly in Ethiopia. An extensive literature survey reveals no journal or conference papers on Amharic character recognition. The Amharic script has 33 basic characters, each with seven orders, giving 231 distinct characters, not including numbers and punctuation symbols. The characters are cursive but not connected and, unlike other cursive scripts, do not use dots. This paper describes the Amharic script and discusses the difficulties of applying conventional structural and syntactic recognition processes. Two statistical algorithms for identifying Amharic characters are described. In both, the characters are normalised for both size and orientation. The first compares the character against a series of templates. The second derives a characteristic signature from the character and compares this against a set of signature templates. The signatures used are fifty times smaller than the original character, and the recognition process is correspondingly faster but with some loss of accuracy. The statistical techniques described have been fully implemented and the resulting performance outlined.
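
The two approaches named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the signature is built, so the sketch assumes one common choice, a projection profile made of row sums followed by column sums. The function names, the distance measures, and the toy 4 × 4 glyphs are all hypothetical and are not taken from the paper. Characters are assumed to be binary bitmaps that have already been normalised for size and orientation.

```python
from typing import Dict, List

# A character is a list of rows of 0/1 pixels, already normalised to a fixed size.
Bitmap = List[List[int]]


def signature(bitmap: Bitmap) -> List[int]:
    """Assumed signature: row sums followed by column sums (projection profile)."""
    row_sums = [sum(row) for row in bitmap]
    col_sums = [sum(col) for col in zip(*bitmap)]
    return row_sums + col_sums


def template_distance(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels at which two same-sized bitmaps disagree."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def signature_distance(sa: List[int], sb: List[int]) -> int:
    """Sum of absolute differences between two signatures."""
    return sum(abs(x - y) for x, y in zip(sa, sb))


def classify_by_template(char: Bitmap, templates: Dict[str, Bitmap]) -> str:
    """First approach: compare the full bitmap against every template."""
    return min(templates, key=lambda label: template_distance(char, templates[label]))


def classify_by_signature(char: Bitmap, sig_templates: Dict[str, List[int]]) -> str:
    """Second approach: compare the compact signature against signature templates."""
    sig = signature(char)
    return min(sig_templates, key=lambda label: signature_distance(sig, sig_templates[label]))


# Toy glyphs (hypothetical, not Amharic characters).
templates = {
    "vertical": [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    "horizontal": [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
}
sig_templates = {label: signature(bm) for label, bm in templates.items()}

# A vertical bar with one stray pixel in the top-right corner.
noisy = [[0, 1, 0, 1], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]

by_template = classify_by_template(noisy, templates)
by_signature = classify_by_signature(noisy, sig_templates)
```

Under this assumed construction, an N × N bitmap has N² pixels but only 2N signature values, so a 100 × 100 character would give a signature fifty times smaller than the bitmap. That matches the ratio quoted in the abstract, but the abstract does not confirm that this is the construction used. The trade-off it mentions also shows up here: distinct bitmaps can share the same row and column sums, so a signature comparison is cheaper but can confuse characters that full template matching would separate.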