Identifying off-topic student essays without topic-specific training data

Authors:
D. Higgins;J. Burstein;Y. Attali
Affiliations:
Educational Testing Service, Rosedale Road, Princeton, NJ 08541, USA; e-mail: dhiggins@ets.org
Venue:
Natural Language Engineering
Year:
2006

Abstract

Educational assessment applications, as well as other natural-language interfaces, need some mechanism for validating user responses. If the input provided to the system is infelicitous or uncooperative, the proper response may be to simply reject it, to route it to a bin for special processing, or to ask the user to modify the input. If problematic user input is instead handled as if it were the system's normal input, this may degrade users' confidence in the software, or suggest ways in which they might try to “game” the system. Our specific task in this domain is the identification of student essays that are “off-topic”, or not written to the test question topic. Identification of off-topic essays is of great importance for the commercial essay evaluation system Criterion℠. The previous methods used for this task required 200–300 human-scored essays for training purposes. However, there are situations in which no essays are available for training, such as when users (teachers) wish to spontaneously write a new topic for their students. For these kinds of cases, we need a system that works reliably without training data. This paper describes an algorithm that detects when a student's essay is off-topic without requiring a set of topic-specific essays for training. This new system is comparable in performance to previous models that require topic-specific essays for training, and provides more detailed information about the way in which an essay diverges from the requested essay topic.