Symmetric pattern matching analysis for English coordinate structures

  • Authors:
  • Akitoshi Okumura;Kazunori Muraki

  • Affiliations:
  • NEC Corp. Information Technology Research Laboratories, Miyazaki, Miyamae-ku, Kawasaki, Kanagawa, Japan;NEC Corp. Information Technology Research Laboratories, Miyazaki, Miyamae-ku, Kawasaki, Kanagawa, Japan

  • Venue:
  • ANLC '94 Proceedings of the fourth conference on Applied natural language processing
  • Year:
  • 1994

Abstract

The authors propose a model for analyzing English sentences that contain coordinate conjunctions such as "and", "or", and "but", and equivalent words. Syntactic analysis of English coordinate sentences is one of the most difficult problems for machine translation (MT) systems. The problem is selecting, from all possible candidates, the correct syntactic structure formed by an individual coordinate conjunction, i.e., determining which constituents the conjunction coordinates. Typically, so many candidate structures are produced that MT systems cannot select the correct one, even when the grammar formalism allows the rules to be written in simple notation. This paper presents an English coordinate structure analysis model that provides top-down scope information about the correct syntactic structure by taking advantage of the symmetric patterns of parallelism. The model is based on a balance matching operation over two lists of feature sets, which provides four effects: reduced analysis cost, improved word disambiguation, interpretation of ellipses, and robust analysis. The model was implemented and incorporated into an English-Japanese MT system, where it achieved about 75% accuracy in practical translation use.
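
The abstract does not specify how the balance matching operation works, so the following is only a minimal sketch of the general idea of matching two lists of feature sets around a conjunction. It is not the authors' algorithm. The names `overlap`, `balance_match`, and `fs`, the Jaccard scoring, and the part-of-speech features are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Tuple

FeatureSet = FrozenSet[str]


def fs(*features: str) -> FeatureSet:
    """Hypothetical helper: build one word's feature set."""
    return frozenset(features)


def overlap(a: FeatureSet, b: FeatureSet) -> float:
    """Jaccard overlap of two feature sets (assumed scoring, 0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def balance_match(left: List[FeatureSet],
                  right: List[FeatureSet]) -> Tuple[int, float]:
    """Return (span length, score) for the best-balanced pair of conjuncts.

    `left` holds the feature sets of the words before the conjunction and
    `right` those after it. For each candidate length k, the k words nearest
    the conjunction on the left are compared position by position with the
    first k words on the right.
    """
    best_k, best_score = 0, 0.0
    for k in range(1, min(len(left), len(right)) + 1):
        score = sum(overlap(a, b) for a, b in zip(left[-k:], right[:k]))
        if score > best_score:  # ties keep the shorter span
            best_k, best_score = k, score
    return best_k, best_score


# "She bought red apples AND green pears" (features are illustrative only)
left = [fs("PRON"), fs("VERB"), fs("ADJ"), fs("NOUN", "plural")]
right = [fs("ADJ"), fs("NOUN", "plural")]
result = balance_match(left, right)
print(result)
```

In this toy example the best span length is 2, pairing "red apples" with "green pears". A span length chosen this way could serve as the kind of top-down scope hint the abstract mentions, although how the actual model derives and uses that information is not stated in this record.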