Statistical Approach for Korean Analysis: A Method Based on Structural Patterns

  • Authors: Nari Kim

  • Venue: AMTA '98 Proceedings of the Third Conference of the Association for Machine Translation in the Americas on Machine Translation and the Information Soup
  • Year: 1998

Abstract

Conventional approaches to Korean analysis have generally used verb subcategorization as lexical knowledge. A problem arises, however, in long sentences that contain two or more verbs with the same subcategorization: a noun phrase may be taken as a constituent of more than one verb, which causes ambiguity. This paper presents an approach to this problem that uses structural patterns acquired from corpora by a statistical method. Structural patterns can serve as processing units for syntactic analysis and for translation into other languages as well. We collected 10,686 unique structural patterns from a Korean corpus of 1.27 million words. We analyzed 2,672 sentences and showed that structural patterns can improve the accuracy of Korean analysis.
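
The abstract does not specify how structural patterns are represented or scored, so the following is only a minimal sketch of the general idea: count patterns in a corpus, then prefer the analysis whose patterns were seen more often. The tuple representation, the toy counts, and the names `collect_patterns` and `choose_attachment` are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy "corpus": each structural pattern is a tuple of
# case-marked noun-phrase slots followed by a verb slot. This
# representation and these counts are assumptions for illustration.
observed_patterns = [
    ("NP-nom", "NP-acc", "V"),
    ("NP-nom", "NP-acc", "V"),
    ("NP-acc", "V"),
    ("NP-acc", "V"),
    ("NP-acc", "V"),
    ("NP-nom", "V"),
    ("NP-nom", "V"),
]


def collect_patterns(patterns):
    """Count how often each structural pattern occurs."""
    return Counter(patterns)


def choose_attachment(candidates, counts):
    """Return the candidate analysis with the highest frequency score.

    Each candidate is a list of patterns, one per verb in the sentence.
    The score is the product of (count + 1) over its patterns; the +1
    keeps an unseen pattern from forcing the score to zero.
    """
    def score(analysis):
        total = 1
        for pattern in analysis:
            total *= counts[pattern] + 1
        return total

    return max(candidates, key=score)


counts = collect_patterns(observed_patterns)

# Two candidate analyses of a sentence with a nominative NP, an
# accusative NP, and two verbs (embedded verb first, main verb second):
# A: the embedded verb takes the accusative NP, the main verb the nominative NP.
# B: the embedded verb takes both NPs, the main verb takes none.
analysis_a = [("NP-acc", "V"), ("NP-nom", "V")]
analysis_b = [("NP-nom", "NP-acc", "V"), ("V",)]

best = choose_attachment([analysis_a, analysis_b], counts)
```

With these toy counts, analysis A scores (3 + 1) × (2 + 1) = 12 and analysis B scores (2 + 1) × (0 + 1) = 3, so A is selected. A product of smoothed counts is a simple stand-in for a statistical score and is not a claim about the model used in the paper.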