MAISE: a flexible, configurable, extensible open source package for mass AI system evaluation

Authors:
Omar F. Zaidan
Affiliations:
Johns Hopkins University, Baltimore, MD
Venue:
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Year:
2011

Citing 1
Cited 0

Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation

WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Quantified Score

Hi-index	0.00

Visualization

Abstract

The past few years have seen an increasing interest in using Amazon's Mechanical Turk for purposes of collecting data and performing annotation tasks. One such task is the mass evaluation of system output in a variety of tasks. In this paper, we present MAISE, a package that allows researchers to evaluate the output of their AI system(s) using human judgments collected via Amazon's Mechanical Turk, greatly streamlining the process. MAISE is open source, easy to run, and platform-independent. The core of MAISE's codebase was used for the manual evaluation of WMT10, and the completed package is being used again in the current evaluation for WMT11. In this paper, we describe the main features, functionality, and usage of MAISE, which is now available for download and use.