Evaluating Natural Language Processing Systems: An Analysis and Review
This article defines a Framework for Machine Translation Evaluation (FEMTI) which relates the quality model used to evaluate a machine translation system to the purpose and context of the system. Our proposal attempts to assemble, into a coherent picture, previous attempts to structure a domain characterised by overall complexity and local difficulties. In this article, we first summarise these attempts, then present an overview of the ISO/IEC guidelines for software evaluation (ISO/IEC 9126 and ISO/IEC 14598). As an application of these guidelines to machine translation software, we introduce FEMTI, a framework made of two interrelated classifications or taxonomies. The first classification enables evaluators to define an intended context of use, while the links to the second classification generate a relevant quality model (quality characteristics and metrics) for that context. The second classification provides definitions of the various metrics used by the community. Further on, as part of ongoing, long-term research, we explain how metrics are analysed, first from the general point of view of "meta-evaluation", then focusing on examples. Finally, we show how consensus on the present framework is sought, and how feedback from the community is taken into account in the FEMTI life-cycle.
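The two-taxonomy structure described above can be sketched in a few lines of code. This is a minimal, hypothetical illustration only: the context names, quality characteristics, and metrics below are invented placeholders, not the actual FEMTI classifications, and the real framework's taxonomies are far richer.

```python
# Hypothetical sketch of FEMTI's two linked taxonomies.
# All names and values are illustrative assumptions, not FEMTI's real content.

# Taxonomy 1: intended contexts of use, each linked to relevant
# quality characteristics in taxonomy 2.
CONTEXT_OF_USE = {
    "assimilation": ["fidelity", "comprehensibility"],
    "dissemination": ["fidelity", "fluency", "terminology"],
}

# Taxonomy 2: quality characteristics mapped to example metrics.
QUALITY_METRICS = {
    "fidelity": ["human adequacy rating"],
    "fluency": ["human fluency rating"],
    "comprehensibility": ["cloze test score"],
    "terminology": ["term error rate"],
}

def quality_model(context: str) -> dict:
    """Follow the links from a context of use (taxonomy 1) to a
    quality model: the relevant characteristics with their metrics."""
    characteristics = CONTEXT_OF_USE[context]
    return {c: QUALITY_METRICS[c] for c in characteristics}

print(quality_model("assimilation"))
```

The point of the design is that an evaluator never picks metrics directly: they describe the context of use, and the links between the two classifications produce the quality model for them.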