This paper describes enhancements to techniques currently used to search large databases of proper names. The enhancements include a Hidden Markov Model (HMM) statistical classifier that identifies the likely linguistic provenance of a surname, and language-specific rules that generate plausible spelling variations of names. These two components were incorporated into a prototype front-end system that drives existing name search procedures. HMMs and sets of linguistic rules were constructed for Farsi, Spanish and Vietnamese surnames and tested on a database of over 11,000 entries. Preliminary evaluation indicates a retrieval improvement of 20-30%, as measured by the number of correct items retrieved.
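The two-stage idea (guess a surname's language, then expand it with that language's spelling rules) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract gives no model details, so a smoothed character-bigram Markov chain stands in for the HMM classifier, and the training names, the rules, and every identifier (`TRAINING`, `RULES`, `classify`, `variants`) are hypothetical rather than taken from the paper.

```python
import math
from collections import defaultdict

# Toy training data: an illustrative assumption, not the paper's data.
TRAINING = {
    "spanish": ["garcia", "rodriguez", "martinez", "hernandez", "gonzalez"],
    "vietnamese": ["nguyen", "tran", "pham", "hoang", "huynh"],
}

# Hypothetical per-language rewrite rules: (substring, replacement).
RULES = {
    "spanish": [("z", "s"), ("ll", "y")],
    "vietnamese": [("ng", "n")],
}


def train(names):
    """Count character-bigram transitions, with ^ and $ as name boundaries."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        padded = "^" + name + "$"
        for a, b in zip(padded, padded[1:]):
            counts[a][b] += 1
    return counts


def log_likelihood(model, name, alphabet_size=28):
    """Add-one smoothed log-probability of a name under a bigram model."""
    padded = "^" + name + "$"
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        row = model.get(a, {})
        total += math.log((row.get(b, 0) + 1) / (sum(row.values()) + alphabet_size))
    return total


def classify(models, name):
    """Return the language whose model gives the name the highest score."""
    return max(models, key=lambda lang: log_likelihood(models[lang], name))


def variants(name, rules):
    """Apply each rule to every variant found so far; keep the original too."""
    out = {name}
    for pattern, replacement in rules:
        out |= {v.replace(pattern, replacement) for v in out}
    return out


MODELS = {lang: train(names) for lang, names in TRAINING.items()}

if __name__ == "__main__":
    query = "fernandez"
    lang = classify(MODELS, query)
    # The expanded set would then be handed to the existing search procedure.
    print(lang, sorted(variants(query, RULES[lang])))
```

In a front end of the kind the abstract describes, the expanded set of spellings would presumably be passed to the existing name search procedure in place of the single query string. A real system would need far larger training sets and linguistically motivated rules.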