Evaluation of automatically reformulated questions in question series

Authors:
Richard Shaw;Ben Solway;Robert Gaizauskas;Mark A. Greenwood
Affiliation (all authors):
University of Sheffield, Regent Court, Sheffield, UK
Venue:
IRQA '08 Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering
Year:
2008

Abstract

Gold standards allow new methods and approaches to be evaluated against a common benchmark. In this paper, we describe a set of gold-standard question reformulations, together with the reformulation guidelines used to produce them, which we have created to support research into the automatic interpretation of questions in TREC question series, where a question may refer anaphorically to the target of the series or to the answers to previous questions. We also assess various string comparison metrics for their utility as evaluation measures of how close an automated system's reformulations are to the gold standard. Finally, we show how we have used this approach to assess the question processing capability of our own QA system and to pinpoint areas for improvement.
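
To make the idea of scoring a reformulation against a gold standard concrete, the following is a minimal sketch in Python. The abstract does not name the string comparison metrics that were assessed, so the two measures here (a normalised token-level edit similarity and a bag-of-words F1) are illustrative assumptions, not the paper's metrics. The function names and the example questions are likewise hypothetical.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute; cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x with y
        prev = curr
    return prev[-1]


def edit_similarity(system, gold):
    """Token-level edit distance rescaled to [0, 1]; 1.0 means identical."""
    s, g = system.lower().split(), gold.lower().split()
    longest = max(len(s), len(g))
    return 1.0 if longest == 0 else 1.0 - levenshtein(s, g) / longest


def token_f1(system, gold):
    """Harmonic mean of token precision and recall, ignoring word order."""
    s, g = Counter(system.lower().split()), Counter(gold.lower().split())
    overlap = sum((s & g).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(s.values())
    recall = overlap / sum(g.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the system failed to resolve the pronoun "it".
gold = "when was the comet discovered"
system = "when was it discovered"
print(edit_similarity(system, gold))
print(token_f1(system, gold))
```

Both measures return 1.0 when the system output matches the gold reformulation exactly and fall towards 0 as the two diverge. The edit-based score is sensitive to word order, whereas the F1 score is not, which is one reason several metrics might be compared before choosing an evaluation measure.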