Improvement of machine translation evaluation by simple linguistically motivated features

Authors:
Mu-Yun Yang;Shu-Qi Sun;Jun-Guo Zhu;Sheng Li;Tie-Jun Zhao;Xiao-Ning Zhu
Affiliations:
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Venue:
Journal of Computer Science and Technology - Special issue on natural language processing
Year:
2011

Citing 12
Cited 0

Head-driven statistical models for natural language parsing

Head-driven statistical models for natural language parsing
A program for aligning sentences in bilingual corpora

Computational Linguistics - Special issue on using large corpora: I
A machine learning approach to the automatic evaluation of machine translation

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
MT evaluation: human-like vs. human acceptable

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics

HLT '02 Proceedings of the second international conference on Human Language Technology Research
Sentence level machine translation evaluation as a ranking problem: one step aside from BLEU

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Linguistic features for automatic evaluation of heterogenous MT systems

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Further meta-evaluation of machine translation

StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Ranking vs. regression in machine translation evaluation

StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
A smorgasbord of features for automatic MT evaluation

StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
A Quantitative Analysis of Linguistic Factors in Human Translation Evaluation

KAM '09 Proceedings of the 2009 Second International Symposium on Knowledge Acquisition and Modeling - Volume 01

Quantified Score

Hi-index	0.00

Visualization

Abstract

Adopting the regression SVM framework, this paper proposes a linguistically motivated feature engineering strategy to develop an MT evaluation metric with a better correlation with human assessments. In contrast to current practices of "greedy" combination of all available features, six features are suggested according to the human intuition for translation quality. Then the contribution of linguistic features is examined and analyzed via a hill-climbing strategy. Experiments indicate that, compared to either the SVM-ranking model or the previous attempts on exhaustive linguistic features, the regression SVM model with six linguistic information based features generalizes across different datasets better, and augmenting these linguistic features with proper non-linguistic metrics can achieve additional improvements.