Informing determiner and preposition error correction with word clusters

Authors:
Adriane Boyd;Marion Zepf;Detmar Meurers
Affiliations:
Universität Tübingen;Universität Tübingen;Universität Tübingen
Venue:
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Year:
2012

References:

Class-based n-gram models of natural language

Computational Linguistics
Learning the countability of English nouns from corpus data

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Memory-based learning for article generation

CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Modeling Discriminative Global Inference

ICSC '07 Proceedings of the International Conference on Semantic Computing
Native judgments of non-native usage: experiments in preposition error detection

HumanJudge '08 Proceedings of the Workshop on Human Judgements in Computational Linguistics
Web-scale N-gram models for lexical disambiguation

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Exploring the data-driven prediction of prepositions in English

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
A new dataset and method for automatically grading ESOL texts

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Algorithm selection and model adaptation for ESL correction tasks

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Helping our own: the HOO 2011 pilot shared task

ENLG '11 Proceedings of the 13th European Workshop on Natural Language Generation
Data-driven correction of function words in non-native English

ENLG '11 Proceedings of the 13th European Workshop on Natural Language Generation

Abstract

We extend our n-gram-based data-driven prediction approach from the Helping Our Own (HOO) 2011 Shared Task (Boyd and Meurers, 2011) to identify determiner and preposition errors in non-native English essays from the Cambridge Learner Corpus FCE Dataset (Yannakoudakis et al., 2011) as part of the HOO 2012 Shared Task. Our system focuses on three error categories: missing determiner, incorrect determiner, and incorrect preposition. Approximately two-thirds of the errors annotated in the HOO 2012 training and test data fall into these three categories. To improve our approach, we developed a missing determiner detector and incorporated word clustering (Brown et al., 1992) into the n-gram prediction approach.
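
The general idea of combining n-gram prediction with word clusters can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate set, the toy trigram counts, the cluster labels, and the function names (to_cluster, predict_preposition, flag_preposition) are all hypothetical, and the backoff order (surface words first, then cluster-level n-grams) is an assumption made for illustration only.

```python
from collections import Counter

# All data below is hypothetical and exists only to illustrate the idea.
PREPOSITIONS = ["in", "on", "at"]

# Toy trigram counts (left word, candidate, right word), standing in for
# counts drawn from a large n-gram collection.
WORD_NGRAMS = Counter({
    ("arrive", "at", "noon"): 12,
    ("arrive", "in", "london"): 9,
    ("arrive", "on", "monday"): 7,
})

# Toy word-to-cluster map, standing in for automatically induced word clusters.
CLUSTERS = {
    "london": "C_CITY", "paris": "C_CITY",
    "monday": "C_DAY", "friday": "C_DAY",
    "noon": "C_TIME", "midnight": "C_TIME",
}


def to_cluster(word):
    """Map a word to its cluster label, or keep the word if it has none."""
    return CLUSTERS.get(word, word)


def build_cluster_ngrams(word_ngrams):
    """Aggregate word-level counts into cluster-level counts."""
    cluster_ngrams = Counter()
    for (left, prep, right), count in word_ngrams.items():
        cluster_ngrams[(to_cluster(left), prep, to_cluster(right))] += count
    return cluster_ngrams


CLUSTER_NGRAMS = build_cluster_ngrams(WORD_NGRAMS)


def predict_preposition(left, right):
    """Return (best candidate, evidence source), or (None, None) without evidence."""
    levels = (
        ("word", WORD_NGRAMS, lambda w: w),
        ("cluster", CLUSTER_NGRAMS, to_cluster),
    )
    for source, counts, key in levels:
        scores = {p: counts[(key(left), p, key(right))] for p in PREPOSITIONS}
        best = max(scores, key=scores.get)
        if scores[best] > 0:
            return best, source
    return None, None


def flag_preposition(left, used, right):
    """Suggest a replacement if the prediction differs from the word used."""
    predicted, _ = predict_preposition(left, right)
    if predicted is not None and predicted != used:
        return predicted
    return None
```

In this sketch, a context such as "arrive ... paris" has no word-level counts, so the prediction falls back to the cluster-level trigram shared with "london". That is the kind of generalization over sparse contexts that word clusters are meant to provide.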