Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics

Authors:
Chin-Yew Lin; Franz Josef Och
Affiliations:
University of Southern California, Marina del Rey, CA; University of Southern California, Marina del Rey, CA
Venue:
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Year:
2004

Citing 7
Cited 56

Abstract

In this paper we describe two new objective automatic evaluation methods for machine translation. The first method is based on the longest common subsequence between a candidate translation and a set of reference translations. The longest common subsequence naturally takes sentence-level structural similarity into account and automatically identifies the longest co-occurring in-sequence n-grams. The second method relaxes strict n-gram matching to skip-bigram matching. A skip-bigram is any pair of words in their sentence order. Skip-bigram co-occurrence statistics measure the overlap of skip-bigrams between a candidate translation and a set of reference translations. The empirical results show that both methods correlate very well with human judgments of both adequacy and fluency.
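
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes whitespace tokenization, a single reference translation rather than a set, no limit on the gap between the two words of a skip-bigram, and an F-measure with a weight `beta` to combine precision and recall. The function names (`lcs_length`, `lcs_score`, `skip_bigrams`, `skip_bigram_score`) are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def f_measure(matches, cand_total, ref_total, beta=1.0):
    """Combine precision and recall; beta is an assumed weighting parameter."""
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


def lcs_score(candidate, reference, beta=1.0):
    """LCS-based score between one candidate and one reference sentence."""
    c, r = candidate.split(), reference.split()
    return f_measure(lcs_length(c, r), len(c), len(r), beta)


def skip_bigrams(tokens):
    """Multiset of all word pairs in sentence order, with arbitrary gaps."""
    return Counter(combinations(tokens, 2))


def skip_bigram_score(candidate, reference, beta=1.0):
    """Skip-bigram overlap score between one candidate and one reference."""
    c = skip_bigrams(candidate.split())
    r = skip_bigrams(reference.split())
    overlap = sum((c & r).values())  # clipped count of shared skip-bigrams
    return f_measure(overlap, sum(c.values()), sum(r.values()), beta)


reference = "police killed the gunman"
print(lcs_score("police kill the gunman", reference))          # 0.75
print(lcs_score("the gunman kill police", reference))          # 0.5
print(skip_bigram_score("police kill the gunman", reference))  # 0.5
```

In this toy example both candidates share three words with the reference, but the one that preserves the reference word order has the longer common subsequence and therefore the higher LCS-based score. The skip-bigram score likewise rewards word pairs that appear in the same order even when other words intervene.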