Paraphrasing spoken Chinese using a paraphrase corpus

  • Authors: Yujie Zhang; Kazuhide Yamamoto

  • Affiliations: National Institute of Information and Communications Technology, 3-5, Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan, e-mail: yujie@nict.go.jp; Nagaoka University of Technology, Niigata 940-2188, Japan, e-mail: yamamoto@fw.ipsj.or.jp

  • Venue: Natural Language Engineering
  • Year: 2005

Abstract

One of the key issues in spoken-language translation is how to deal with unrestricted expressions in spontaneous utterances. We have developed a paraphraser for use as part of a translation system, and in this paper we describe the implementation of a Chinese paraphraser for a Chinese-Japanese spoken-language translation system. When an input sentence cannot be translated by the transfer engine, the paraphraser automatically transforms the sentence into alternative expressions until one of these alternatives can be translated by the transfer engine. Two primary issues must be dealt with in paraphrasing: how to determine new expressions, and how to retain the meaning of the input sentence. We use a pattern-based approach in which the meaning is retained to the greatest possible extent without deep parsing. The paraphrase patterns are acquired from a paraphrase corpus and from human experience: paraphrase instances are automatically extracted and then generalized into paraphrase patterns. We ran a paraphrasing experiment using the implemented paraphraser with a total of 1719 paraphrase patterns obtained by this method. The results showed that the paraphraser generated 1.7 paraphrases on average for each test sentence and achieved an accuracy of 88%.
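
The control loop described in the abstract (paraphrase the input until the transfer engine accepts one of the alternatives) can be illustrated with a minimal sketch. This is not the authors' implementation: the pattern format (regular-expression and replacement pairs), the breadth-first search order, the `max_steps` limit, and the `can_translate` callback standing in for the transfer engine are all assumptions made for illustration, and toy English strings stand in for Chinese utterances.

```python
import re
from collections import deque
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical pattern format: (regular expression, replacement string).
ParaphrasePattern = Tuple[str, str]


def apply_patterns(sentence: str, patterns: Sequence[ParaphrasePattern]) -> List[str]:
    """Return the distinct sentences produced by applying each pattern once."""
    results: List[str] = []
    for regex, replacement in patterns:
        candidate = re.sub(regex, replacement, sentence, count=1)
        if candidate != sentence and candidate not in results:
            results.append(candidate)
    return results


def paraphrase_until_translatable(
    sentence: str,
    patterns: Sequence[ParaphrasePattern],
    can_translate: Callable[[str], bool],
    max_steps: int = 50,
) -> Optional[str]:
    """Search paraphrases breadth-first until one is accepted.

    `can_translate` is a stand-in for the transfer engine (an assumption).
    Returns the first accepted sentence, or None if none is found
    within `max_steps` examined sentences.
    """
    queue = deque([sentence])
    seen = {sentence}
    steps = 0
    while queue and steps < max_steps:
        current = queue.popleft()
        steps += 1
        if can_translate(current):
            return current
        for candidate in apply_patterns(current, patterns):
            if candidate not in seen:
                seen.add(candidate)
                queue.append(candidate)
    return None


# Toy patterns and a toy "translatable" set, both invented for this example.
TOY_PATTERNS: List[ParaphrasePattern] = [
    (r"\bcould you please\b", "please"),
    (r"\bplease tell me\b", "tell me"),
]
TOY_TRANSLATABLE = {"tell me the time"}

result = paraphrase_until_translatable(
    "could you please tell me the time",
    TOY_PATTERNS,
    lambda s: s in TOY_TRANSLATABLE,
)
print(result)
```

In this sketch the `seen` set prevents the same paraphrase from being examined twice, and the step limit keeps the search bounded when no alternative is translatable. How the actual system orders candidates and preserves meaning is not specified in the abstract.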