Applying the noisy channel model to search query spelling correction requires an error model and a language model. Typically, the error model relies on a weighted string edit distance measure. The weights can be learned from pairs of misspelled words and their corrections. This paper investigates using the Expectation Maximization algorithm to learn edit distance weights directly from search query logs, without relying on a corpus of paired words.
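
As a rough illustration of the setup described above, the following sketch scores candidate corrections with a noisy channel: a language model prior combined with an error model based on a weighted edit distance. It is a minimal sketch under stated assumptions, not the paper's method. The cost tables (`SUB_COST`, `DEFAULT_SUB`, `INS_COST`, `DEL_COST`), the toy language model, and the function names are hypothetical, and the weights are fixed by hand here rather than learned with Expectation Maximization from query logs.

```python
import math

# Hypothetical edit costs, read as negative log-probabilities of each edit.
# In the setting described above these weights would be learned; here they
# are hand-picked purely for illustration.
SUB_COST = {("a", "e"): 0.5}   # cheaper substitution for one confusable pair
DEFAULT_SUB = 1.0              # cost of any other substitution
INS_COST = 1.0                 # cost of inserting one character
DEL_COST = 1.0                 # cost of deleting one character


def weighted_edit_distance(source, target):
    """Minimum total cost of turning `source` into `target` (dynamic programming)."""
    m, n = len(source), len(target)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + DEL_COST
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + INS_COST
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a, b = source[i - 1], target[j - 1]
            sub = 0.0 if a == b else SUB_COST.get((a, b), DEFAULT_SUB)
            d[i][j] = min(
                d[i - 1][j] + DEL_COST,      # delete a from source
                d[i][j - 1] + INS_COST,      # insert b into source
                d[i - 1][j - 1] + sub,       # match or substitute a -> b
            )
    return d[m][n]


def correct(query, language_model):
    """Return the candidate c maximizing log P(c) + log P(query | c).

    `language_model` maps candidate strings to prior probabilities.
    The error model approximates log P(query | c) by the negated
    weighted edit distance from c to the query.
    """
    return max(
        language_model,
        key=lambda c: math.log(language_model[c]) - weighted_edit_distance(c, query),
    )
```

For example, with a toy language model such as `{"the": 0.9, "ten": 0.1}`, calling `correct("teh", ...)` weighs the higher prior of "the" against its larger edit cost. Treating costs as negative log-probabilities keeps the two models on the same scale, so they can simply be added.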