Generating confusion sets for context-sensitive error correction

Authors:
Alla Rozovskaya;Dan Roth
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Year:
2010

Citing 19
Cited 14

Learning to resolve natural language ambiguities: a unified approach

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
A Winnow-Based Approach to Context-Sensitive Spelling Correction

Machine Learning - Special issue on natural language learning
Large Margin Classification Using the Perceptron Algorithm

Machine Learning - The Eleventh Annual Conference on computational Learning Theory
Scaling Up Context-Sensitive Text Correction

Proceedings of the Thirteenth Conference on Innovative Applications of Artificial Intelligence Conference
Scaling to very very large corpora for natural language disambiguation

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Automatic error detection in the Japanese learners' English spoken data

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 2
Detecting errors in English article usage by non-native speakers

Natural Language Engineering
Modeling Discriminative Global Inference

ICSC '07 Proceedings of the International Conference on Semantic Computing
Memory-Based Context-Sensitive Spelling Correction at Web Scale

ICMLA '07 Proceedings of the Sixth International Conference on Machine Learning and Applications
The importance of syntactic parsing and inference in semantic role labeling

Computational Linguistics
A classifier-based approach to preposition and determiner error correction in L2 English

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
The ups and downs of preposition error detection in ESL writing

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Detection of grammatical errors involving prepositions

SigSem '07 Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions
Automatically acquiring models of preposition use

SigSem '07 Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions
Training paradigms for correcting errors in grammar and usage

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Using mostly native data to correct errors in learners' writing: a meta-classifier approach

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Using parse features for preposition selection and error detection

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Annotating ESL errors: challenges and rewards

IUNLPBEA '10 Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications
Exploring the data-driven prediction of prepositions in English

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters

Grammatical error correction with alternating structure optimization

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Algorithm selection and model adaptation for ESL correction tasks

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Why press backspace?: understanding user input behaviors in Chinese Pinyin input method

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
They can help: using crowdsourcing to improve the evaluation of grammatical error detection systems

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Selective block minimization for faster convergence of limited memory large-scale linear models

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
High-order sequence modeling for language learner error detection

IUNLPBEA '11 Proceedings of the 6th Workshop on Innovative Use of NLP for Building Educational Applications
Correcting semantic collocation errors with L1-induced paraphrases

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
University of Illinois system in HOO text correction shared task

ENLG '11 Proceedings of the 13th European Workshop on Natural Language Generation
"Could you make me a favour and do coffee, please?": implications for automatic error correction in English and Dutch

SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
HOO 2012 error recognition and correction shared task: Cambridge University submission report

Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Korea University system in the HOO 2012 shared task

Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
The UI system in the HOO 2012 shared task on error correction

Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Grammar error correction using pseudo-error sentences and domain adaptation

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Bucking the trend: improved evaluation and annotation practices for ESL error detection systems

Language Resources and Evaluation

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we consider the problem of generating candidate corrections for the task of correcting errors in text. We focus on the task of correcting errors in preposition usage made by non-native English speakers, using discriminative classifiers. The standard approach to the problem assumes that the set of candidate corrections for a preposition consists of all preposition choices participating in the task. We determine likely preposition confusions using an annotated corpus of non-native text and use this knowledge to produce smaller sets of candidates. We propose several methods of restricting candidate sets. These methods exclude candidate prepositions that are not observed as valid corrections in the annotated corpus and take into account the likelihood of each preposition confusion in the non-native text. We find that restricting candidates to those that are observed in the non-native data improves both the precision and the recall compared to the approach that views all prepositions as possible candidates. Furthermore, the approach that takes into account the likelihood of each preposition confusion is shown to be the most effective.