Automatic extraction of Chinese V-N collocations

  • Authors:
  • Xiaofei Qian

  • Affiliations:
  • College of Liberal Arts, Shanghai University, Shanghai, China

  • Venue:
  • CLSW'12 Proceedings of the 13th Chinese conference on Chinese Lexical Semantics
  • Year:
  • 2012

Abstract

Chinese V-N collocations can stand in one of two structural relations: verb-object or attributive-head. Both kinds are widely used in Chinese language processing tasks, but long-distance and low-frequency collocations are often difficult to extract. A weighted mutual information (WMI) model and a rule-based method were designed to acquire V-N collocations by taking more syntactic structure features into account. The WMI model extracts verb-object collocations within clauses; by assigning different weights to collocates in different positions, it reduces interference from illegal collocates and increases the weight of long-distance collocates. The rule-based method uses part-of-speech patterns to extract verb-object and attributive-head collocations and to infer implicit collocations. Experiments show that the WMI model improves evaluation scores for long-distance collocations, while the rule-based method is more accurate at extracting and distinguishing the two kinds of collocations, including low-frequency ones.
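
The abstract does not give the WMI formula or the weights, so the following is only a minimal sketch of the general idea of position-weighted mutual information, not the author's model. The function name `weighted_mi`, the `POSITION_WEIGHTS` table and its values, the right-of-verb search window, and the toy clauses are all assumptions made for illustration. Each verb-noun co-occurrence inside a clause adds a weight that depends on the noun's distance from the verb, and the weighted count takes the place of the raw joint count in a pointwise mutual information score.

```python
import math
from collections import Counter

# Hypothetical weights: distance of the noun to the right of the verb -> weight.
# The values are illustrative assumptions, not taken from the paper.
POSITION_WEIGHTS = {1: 1.0, 2: 1.2, 3: 1.5, 4: 1.5}


def weighted_mi(clauses, weights=POSITION_WEIGHTS):
    """Score (verb, noun) pairs with a position-weighted mutual information.

    clauses: list of clauses, each a list of (word, pos) tuples,
             where pos is "v" for verbs and "n" for nouns.
    """
    word_counts = Counter()
    pair_counts = Counter()  # weighted co-occurrence counts
    total = 0
    for clause in clauses:
        for i, (word, pos) in enumerate(clause):
            word_counts[word] += 1
            total += 1
            if pos != "v":
                continue
            # Look only to the right of the verb, within the same clause.
            for offset, weight in weights.items():
                j = i + offset
                if j < len(clause) and clause[j][1] == "n":
                    pair_counts[(word, clause[j][0])] += weight
    scores = {}
    for (verb, noun), count in pair_counts.items():
        p_pair = count / total
        p_verb = word_counts[verb] / total
        p_noun = word_counts[noun] / total
        scores[(verb, noun)] = math.log2(p_pair / (p_verb * p_noun))
    return scores


# Toy, pre-segmented and POS-tagged clauses (made up for this example).
clauses = [
    [("吃", "v"), ("苹果", "n")],
    [("吃", "v"), ("红", "a"), ("苹果", "n")],
    [("看", "v"), ("书", "n")],
]
scores = weighted_mi(clauses)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this sketch the pair separated by an intervening adjective contributes 1.2 rather than 1.0 to the weighted count, which is one simple way to keep long-distance collocates from being under-scored; a real system would tune the weights on annotated data.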