Creating a manually error-tagged and shallow-parsed learner corpus

Authors:
Ryo Nagata;Edward Whittaker;Vera Sheinman
Affiliations:
Konan University, Okamoto, Kobe, Japan;The Japan Institute for Educational Measurement Inc., Kita-Aoyama, Tokyo, Japan;The Japan Institute for Educational Measurement Inc., Kita-Aoyama, Tokyo, Japan
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Year:
2011

Abstract

The availability of learner corpora, especially those that have been manually error-tagged or shallow-parsed, is still limited. As a result, researchers lack a common development and test set for natural language processing of learner English, for tasks such as grammatical error detection. Given this background, we created a novel learner corpus that was manually error-tagged and shallow-parsed. This corpus is available for research and educational purposes on the web. In this paper, we describe it in detail together with its data-collection method and annotation schemes. Another contribution of this paper is that we take the first step toward evaluating the performance of existing POS-tagging/chunking techniques on learner corpora using the created corpus. These contributions will facilitate further research in related areas such as grammatical error detection and automated essay scoring.