Semi-supervised learning for automatic conceptual property extraction

  • Authors:
  • Colin Kelly; Barry Devereux; Anna Korhonen

  • Affiliations:
  • University of Cambridge, Cambridge, UK; University of Cambridge, Cambridge, UK; University of Cambridge, Cambridge, UK

  • Venue:
  • CMCL '12 Proceedings of the 3rd Workshop on Cognitive Modeling and Computational Linguistics
  • Year:
  • 2012

Abstract

For a given concrete noun concept, humans are usually able to cite properties (e.g., elephant is animal, car has wheels) of that concept; cognitive psychologists have theorised that such properties are fundamental to understanding the abstract mental representation of concepts in the brain. Consequently, the ability to automatically extract such properties would be of enormous benefit to the field of experimental psychology. This paper investigates the use of semi-supervised learning and support vector machines to automatically extract concept-relation-feature triples from two large corpora (Wikipedia and UKWAC) for concrete noun concepts. Previous approaches have relied on manually generated rules and hand-crafted resources such as WordNet; our method requires neither, yet achieves better performance than these prior approaches, measured both by comparison with a gold standard derived from property norms and by direct human evaluation. Our technique performs particularly well on extracting features relevant to a given concept, and suggests a number of promising areas for future focus.