Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification

  • Authors:
  • Zhaopeng Tu;Yifan He;Jennifer Foster;Josef van Genabith;Qun Liu;Shouxun Lin

  • Affiliations:
  • Institute of Computing Technology, CAS;New York University and Dublin City University;Dublin City University;Dublin City University;Institute of Computing Technology, CAS;Institute of Computing Technology, CAS

  • Venue:
  • ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
  • Year:
  • 2012

Abstract

Convolution kernels support the modeling of complex syntactic information in machine-learning tasks. However, such models are highly sensitive to the type and size of the syntactic structures used. It is therefore an important challenge to automatically identify high-impact sub-structures relevant to a given task. In this paper we present a systematic study investigating sequence and convolution kernels, alone and in combination, using different types of sub-structures in document-level sentiment classification. We show that minimal sub-structures extracted from constituency and dependency trees, guided by a polarity lexicon, yield a 1.45-point absolute improvement in accuracy over a bag-of-words classifier on a widely used sentiment corpus.
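
To make the idea of lexicon-guided sub-structures concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a toy polarity lexicon, a dependency tree given as a list of head indices, and a simple count-overlap similarity standing in for a true convolution tree kernel. The names `POLARITY_LEXICON`, `extract_substructures`, and `overlap_kernel` are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon; the lexicon used in the paper is not given here.
POLARITY_LEXICON = {"great", "terrible", "boring", "love"}


def extract_substructures(tokens, heads, lexicon=POLARITY_LEXICON):
    """Return (head, dependent) word pairs that touch a polarity word.

    tokens: list of words in one sentence.
    heads:  heads[i] is the index of token i's head, or -1 for the root.
    """
    subs = []
    for i, h in enumerate(heads):
        if h < 0:
            continue  # the root has no incoming dependency edge
        dep, head = tokens[i].lower(), tokens[h].lower()
        if dep in lexicon or head in lexicon:
            subs.append((head, dep))
    return subs


def overlap_kernel(subs_a, subs_b):
    """Dot product of sub-structure count vectors.

    This is an assumed simplification: a real convolution tree kernel
    counts shared tree fragments recursively rather than flat pairs.
    """
    count_a, count_b = Counter(subs_a), Counter(subs_b)
    return sum(count_a[s] * count_b[s] for s in count_a if s in count_b)


# Two toy sentences with "great" as the syntactic root.
doc_a = extract_substructures(["The", "plot", "was", "great"], [1, 3, 3, -1])
doc_b = extract_substructures(["The", "acting", "was", "great"], [1, 3, 3, -1])
similarity = overlap_kernel(doc_a, doc_b)
```

In this sketch only the edges adjacent to a lexicon word are kept, so the edge from "plot" to "The" is discarded. This mirrors, in a very reduced form, the abstract's point that small, task-relevant sub-structures can matter more than whole trees.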