High-order sequence modeling for language learner error detection

Authors:
Michael Gamon
Affiliations:
Microsoft Research, Redmond, WA
Venue:
IUNLPBEA '11 Proceedings of the 6th Workshop on Innovative Use of NLP for Building Educational Applications
Year:
2011

Citing 17
Cited 3

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Maximum Entropy Markov Models for Information Extraction and Segmentation

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
An unsupervised method for detecting grammatical errors

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
How to detect grammatical errors in a text without parsing it

EACL '87 Proceedings of the third conference on European chapter of the Association for Computational Linguistics
Feature-rich part-of-speech tagging with a cyclic dependency network

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Learning accurate, compact, and interpretable tree annotation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A classifier-based approach to preposition and determiner error correction in L2 English

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
The ups and downs of preposition error detection in ESL writing

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
NER systems that suit user's preferences: adjusting the recall-precision trade-off for entity extraction

NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Detection of grammatical errors involving prepositions

SigSem '07 Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions
Automated Grammatical Error Detection for Language Learners

Automated Grammatical Error Detection for Language Learners
An overview of Microsoft web N-gram corpus and applications

HLT-DEMO '10 Proceedings of the NAACL HLT 2010 Demonstration Session
Training paradigms for correcting errors in grammar and usage

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Using mostly native data to correct errors in learners' writing: a meta-classifier approach

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Generating confusion sets for context-sensitive error correction

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Automated whole sentence grammar correction using a noisy channel model

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1

Correcting comma errors in learner essays, and restoring commas in newswire text

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Correction detection and error type selection as an ESL educational aid

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Evidence in automatic error correction improves learners' english skill

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2

Quantified Score

Hi-index	0.00

Visualization

Abstract

We address the problem of detecting English language learner errors by using a discriminative high-order sequence model. Unlike most work in error-detection, this method is agnostic as to specific error types, thus potentially allowing for higher recall across different error types. The approach integrates features from many sources into the error-detection model, ranging from language model-based features to linguistic analysis features. Evaluation results on a large annotated corpus of learner writing indicate the feasibility of our approach on a realistic, noisy and inherently skewed set of data. High-order models consistently outperform low-order models in our experiments. Error analysis on the output shows that the calculation of precision on the test set represents a lower bound on the real system performance.