Semi-supervised learning for automatic conceptual property extraction

Authors:
Colin Kelly; Barry Devereux; Anna Korhonen
Affiliations:
University of Cambridge, Cambridge, UK; University of Cambridge, Cambridge, UK; University of Cambridge, Cambridge, UK
Venue:
CMCL '12 Proceedings of the 3rd Workshop on Cognitive Modeling and Computational Linguistics
Year:
2012

Abstract

For a given concrete noun concept, humans can usually cite properties of that concept (e.g., elephant is animal, car has wheels); cognitive psychologists have theorised that such properties are fundamental to understanding the abstract mental representation of concepts in the brain. Consequently, the ability to extract such properties automatically would be of enormous benefit to the field of experimental psychology. This paper investigates the use of semi-supervised learning and support vector machines to automatically extract concept-relation-feature triples for concrete noun concepts from two large corpora (Wikipedia and UKWAC). Previous approaches have relied on manually generated rules and hand-crafted resources such as WordNet; our method requires neither, yet achieves better performance than these prior approaches, measured both by comparison with a gold standard derived from property norms and by direct human evaluation. Our technique performs particularly well at extracting features relevant to a given concept, and the results suggest a number of promising areas for future work.