We have designed, implemented and evaluated an end-to-end spellchecking and autocorrection system that does not require any manually annotated training data. The World Wide Web is used as a large noisy corpus from which we infer knowledge about misspellings and word usage. This knowledge is used to build an error model and an n-gram language model. A small secondary set of news texts with artificially inserted misspellings is used to tune confidence classifiers. Because no manual annotation is required, our system can easily be instantiated for new languages. When evaluated on human-typed data with real misspellings in English and German, our web-based systems outperform baselines which use candidate corrections based on hand-curated dictionaries. Our system achieves a 3.8% total error rate in English. We show similar improvements in preliminary results on artificial data for Russian and Arabic.
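To make the combination of an error model and an n-gram language model concrete, the following is a minimal sketch, not the system described above. It assumes a tiny in-memory corpus in place of web-scale text, a bigram model with add-one smoothing, and a hypothetical error model that charges a fixed log-probability penalty per edit; the names `CORPUS`, `log_error_model`, `log_bigram` and `correct`, and the `penalty` value, are illustrative only.

```python
import math
from collections import Counter

# Toy stand-in for a large noisy corpus (assumption: real data would be web-scale).
CORPUS = ("the cat sat on the mat . the cat ate the fish . "
          "the dog sat on the rug .").split()

unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB = set(unigrams)


def edit_distance(a, b):
    """Plain Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def log_error_model(typed, intended, penalty=4.0):
    """Hypothetical error model: a fixed log-probability cost per edit."""
    return -penalty * edit_distance(typed, intended)


def log_bigram(prev_word, word):
    """Add-one smoothed bigram log-probability of `word` after `prev_word`."""
    return math.log((bigrams[(prev_word, word)] + 1)
                    / (unigrams[prev_word] + len(VOCAB)))


def correct(prev_word, typed, max_edits=2):
    """Pick the candidate maximising error-model score plus language-model score."""
    candidates = [w for w in VOCAB if edit_distance(typed, w) <= max_edits]
    if not candidates:
        return typed  # nothing close enough: leave the word unchanged
    return max(candidates,
               key=lambda w: log_error_model(typed, w) + log_bigram(prev_word, w))


print(correct("the", "caat"))
print(correct("the", "dig"))
```

In this sketch the candidate set comes from the corpus vocabulary rather than from a hand-curated dictionary, which loosely mirrors the idea of inferring word usage from text. The confidence classifiers mentioned in the abstract are not modelled here.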