Proceedings of the Second Workshop on Statistical Machine Translation

  • Authors:
  • Chris Callison-Burch;Philipp Koehn;Christof Monz;Cameron Shaw Fordyce

  • Affiliations:
  • Johns Hopkins University;University of Edinburgh;Queen Mary, University of London;Center for the Evaluation of Language and Communication Technologies

  • Venue:
  • StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
  • Year:
  • 2007

Abstract

The ACL 2007 Workshop on Statistical Machine Translation (WMT-07) took place on Saturday, June 23 in Prague, Czech Republic, immediately preceding the annual meeting of the Association for Computational Linguistics, which was hosted by Charles University. This was the second time this workshop had been held, following the first workshop at the 2006 HLT-NAACL conference. But its ancestry can be traced further back to the ACL 2005 Workshop on Building and Using Parallel Texts (when we started our evaluation campaign on European languages), and even the ACL 2001 Workshop on Data-Driven Machine Translation (which was the first ACL workshop mostly directed at statistical machine translation).
Over the last few years, interest in statistical machine translation has risen dramatically. We received an overwhelming number of full paper submissions for a one-day workshop, 38 in total. Given our limited capacity, we were only able to accept 12 full papers for oral presentation and 9 papers for poster presentation, an acceptance rate of 55%. In a second poster session, 16 additional shared task papers were presented. The workshop also featured an invited talk by Jean Senellart of SYSTRAN Language Translation Technology, Paris.
Prior to the workshop, in addition to soliciting relevant papers for review and possible presentation, we conducted a shared task that brought together machine translation systems for an evaluation on previously unseen data. This year's task resembled the shared tasks of previous years in many ways. Its focus was again the translation of European languages, using a relatively large training corpus. This year, we included a variety of manual evaluations of the MT systems' outputs, and a variety of automated evaluation metrics. Also, as a special challenge this year, we posed the problem of domain adaptation.
The results of the shared task were announced at the workshop. These proceedings also include an overview paper for the shared task that summarizes the results and provides information about the data used and the procedures followed in conducting and scoring the task. In addition, there are short papers from each participating team that describe their underlying system in some detail.