Methods for augmenting semantic models with structural information for text classification

  • Authors:
  • Jonathan M. Fishbein; Chris Eliasmith

  • Affiliations:
  • Department of Systems Design Engineering, Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, Canada; Department of Systems Design Engineering and Department of Philosophy, Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, Canada

  • Venue:
  • ECIR'08: Proceedings of the 30th European Conference on IR Research, Advances in Information Retrieval
  • Year:
  • 2008

Abstract

Current representation schemes for automatic text classification treat documents as syntactically unstructured collections of words or 'concepts'. Past attempts to encode syntactic structure have treated part-of-speech information as another word-like feature, but have been shown to be less effective than non-structural approaches. Here, we investigate three methods to augment semantic modelling with syntactic structure, which encode the structure across all features of the document vector while preserving text semantics. We present classification results for these methods versus the Bag-of-Concepts semantic modelling representation to determine which method best improves classification scores.