Constructing Chinese opinion-element collocation dataset using search engine and ontology

  • Authors:
  • Tianfang Yao;Mosha Chen

  • Affiliations:
  • Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China;Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China

  • Venue:
  • CLSW'12 Proceedings of the 13th Chinese conference on Chinese Lexical Semantics
  • Year:
  • 2012

Abstract

In this paper, we present a novel approach to constructing an opinion-element collocation dataset for the Chinese language. An opinion-element collocation is a collocation whose component words include an opinion or sentiment element. Such a dataset is useful for many aspects of opinion mining. A search engine serves as the fundamental tool, mainly because it helps us find both domain-specific and domain-independent collocation pairs; an ontology serves as a resource because it offers rich semantic information that helps us classify collocations as domain-specific or domain-independent. The tool and the resource are combined into a system that automatically crawls data from the Internet and analyzes the extracted collocations. To ensure the quality of the extracted collocations, we evaluate them manually. Experimental results on the COAE2008 public corpus indicate that the approach is effective across the four domains.