Lexical knowledge acquisition from bilingual corpora

  • Authors:
  • Takehito Utsuro; Yuji Matsumoto; Makoto Nagao

  • Affiliations:
  • Kyoto University, Kyoto, Japan; Kyoto University, Kyoto, Japan; Kyoto University, Kyoto, Japan

  • Venue:
  • COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
  • Year:
  • 1992

Abstract

For practical research in natural language processing, it is indispensable to develop a large-scale semantic dictionary for computers. It is especially important to improve the techniques for compiling semantic dictionaries from natural language texts, such as those in existing human dictionaries or in large corpora. However, there are at least two difficulties in analyzing existing texts: syntactic ambiguity and polysemy. Our approach to solving these difficulties is to make use of translation examples in two distinct languages that have quite different syntactic structures and word meanings. We take this approach because, in many cases, both syntactic and semantic ambiguities are resolved by comparing the analyzed results from the two languages. In this paper, we propose a method for resolving the syntactic ambiguities of translation examples in bilingual corpora and a method for acquiring lexical knowledge, such as case frames of verbs and attribute sets of nouns.