Semi-supervised discourse relation classification with structural learning

  • Authors:
  • Hugo Hernault;Danushka Bollegala;Mitsuru Ishizuka

  • Affiliations:
  • Graduate School of Information Science & Technology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan;Graduate School of Information Science & Technology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan;Graduate School of Information Science & Technology, The University of Tokyo, Bunkyo-ku, Tokyo, Japan

  • Venue:
  • CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
  • Year:
  • 2011

Abstract

The corpora available for training discourse relation classifiers are annotated using a general set of discourse relations. However, certain applications require custom discourse relations, and creating a new annotated corpus with a new relation taxonomy is a time-consuming and costly process. We address this problem by proposing a semi-supervised approach to discourse relation classification based on Structural Learning. First, we solve a set of auxiliary classification problems using unlabeled data. Second, the learned classifiers are used to extend the feature vectors on which a discourse relation classifier is trained. By defining a relevant set of auxiliary classification problems, we show that the proposed method improves accuracy and F-score by at least 50% on the RST Discourse Treebank and the Penn Discourse Treebank when small training sets of ca. 1000 training instances are employed. This is an attractive prospect for training discourse relation classifiers on domains where little labeled training data is available.
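The two steps described in the abstract can be illustrated with a minimal sketch of structural learning in general. This is not the authors' implementation. It assumes that each auxiliary problem predicts whether one chosen feature is present from the remaining features, that ridge-regularized least squares stands in for the auxiliary linear classifiers, and that the shared structure is taken from a singular value decomposition of the stacked auxiliary weight vectors. The function names, the `ridge` parameter, and the toy data are hypothetical.

```python
import numpy as np

def learn_shared_structure(X_unlabeled, aux_feature_ids, h, ridge=1.0):
    """Step 1: solve auxiliary problems on unlabeled data.

    Returns a projection matrix theta of shape (h, d).
    Assumes h <= min(d, number of auxiliary problems).
    """
    n, d = X_unlabeled.shape
    weights = []
    for j in aux_feature_ids:
        # Assumed auxiliary problem: is feature j present (+1) or absent (-1)?
        y = np.where(X_unlabeled[:, j] > 0, 1.0, -1.0)
        X_masked = X_unlabeled.copy()
        X_masked[:, j] = 0.0  # hide the target feature from the predictor
        # Ridge-regularized least squares as a stand-in linear classifier.
        A = X_masked.T @ X_masked + ridge * np.eye(d)
        weights.append(np.linalg.solve(A, X_masked.T @ y))
    W = np.column_stack(weights)  # shape (d, m), one column per problem
    # The top-h left singular vectors span the structure shared by the problems.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T

def extend_features(X, theta):
    """Step 2: append the projected features theta @ x to each vector x."""
    return np.hstack([X, X @ theta.T])

# Toy usage with random binary features (illustrative data only).
rng = np.random.default_rng(0)
X_unlabeled = (rng.random((200, 10)) > 0.5).astype(float)
X_labeled = (rng.random((5, 10)) > 0.5).astype(float)
theta = learn_shared_structure(X_unlabeled, aux_feature_ids=[0, 1, 2, 3], h=2)
X_extended = extend_features(X_labeled, theta)  # train the final classifier on this
```

The extended vectors keep the original features and add `h` dimensions learned from unlabeled data, so any standard supervised classifier can be trained on the small labeled set afterwards.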