The kappa statistic: a second look
Computational Linguistics
Inter-coder agreement for computational linguistics
Computational Linguistics
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
HyTER: meaning-equivalent semantics for translation evaluation
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Towards a predicate-argument evaluation for MT
SSST-6 '12 Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation
Putting human assessments of machine translation systems in order
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Findings of the 2012 workshop on statistical machine translation
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Statistical machine translation enhancements through linguistic levels: A survey
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
The Workshop on Statistical Machine Translation (WMT) has become one of ACL's flagship workshops, held annually since 2006. In addition to soliciting papers from the research community, WMT also features a shared translation task for evaluating MT systems. This shared task is notable for having manual evaluation as its cornerstone. The Workshop's overview paper, playing a descriptive and administrative role, reports the main results of the evaluation without delving deep into analyzing those results. The aim of this paper is to investigate and explain some interesting idiosyncrasies in the reported results, which only become apparent when performing a more thorough analysis of the collected annotations. Our analysis sheds some light on how the reported results should (and should not) be interpreted, and also gives rise to some helpful recommendation for the organizers of WMT.