Method for evaluation of stemming algorithms based on error counting
Journal of the American Society for Information Science
Phonetic string matching: lessons from information retrieval
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Applications of approximate word matching in information retrieval
CIKM '97 Proceedings of the sixth international conference on Information and knowledge management
A Winnow-Based Approach to Context-Sensitive Spelling Correction
Machine Learning - Special issue on natural language learning
An Algorithm that Learns What‘s in a Name
Machine Learning - Special issue on natural language learning
Spelling correction in user interfaces
Communications of the ACM
Advances in phonetic word spotting
Proceedings of the tenth international conference on Information and knowledge management
A systematic comparison of various statistical alignment models
Computational Linguistics
Transliteration of proper names in cross-language applications
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Statistical transliteration for english-arabic cross language information retrieval
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
A spelling correction program based on a noisy channel model
COLING '90 Proceedings of the 13th conference on Computational linguistics - Volume 2
The text retrieval conferences (TRECS)
TIPSTER '98 Proceedings of a workshop on held at Baltimore, Maryland: October 13-15, 1998
Using Soundex codes for indexing names in ASR documents
SpeechIR '04 Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004
When Harry met Harri: cross-lingual name spelling normalization
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Phonetic models for generating spelling variants
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Bootstrapped named entity recognition for product attribute extraction
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.01 |
Many proper names are spelled inconsistently in speech recognizer output, posing a problem for applications where locating mentions of named entities is critical. We model the distortion in the spelling of a name due to the speech recognizer as the effect of a noisy channel. The models follow the framework of the IBM translation models. The model is trained using a parallel text of closed caption and automatic speech recognition output. We also test a string edit distance based method. The effectiveness of these models is evaluated on a name query retrieval task. Our methods result in a 60% improvement in F1. We also demonstrate why the problem has not been critical in TREC and TDT tasks.