Acquiring human-like feature-based conceptual representations from corpora

Authors:
Colin Kelly;Barry Devereux;Anna Korhonen
Affiliations:
University of Cambridge, Cambridge, UK;University of Cambridge, Cambridge, UK;University of Cambridge, Cambridge, UK
Venue:
CN '10 Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics
Year:
2010

Abstract

The automatic acquisition of feature-based conceptual representations from text corpora can be challenging, given the unconstrained nature of human-generated features. We examine large-scale extraction of concept-relation-feature triples and the utility of syntactic, semantic, and encyclopedic information in guiding this complex task. Traditional methods do not investigate the full range of triples occurring in human-generated norms (e.g. flute produce sound); instead, they target concept-feature pairs (e.g. flute - sound) or triples involving specific relations (e.g. is-a, part-of). We introduce a novel method that extracts candidate triples (e.g. deer have antlers, flute produce sound) from parsed data and re-ranks them using semantic information. We apply this technique to Wikipedia and the British National Corpus and assess its accuracy in a variety of ways. Our work demonstrates the utility of external knowledge in guiding feature extraction, and suggests a number of avenues for future work.
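
To make the idea of extracting candidate triples from parsed data and then re-ranking them more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the parser output format or the re-ranking function, so the dependency labels (`nsubj`, `dobj`), the toy parses, the `SEMANTIC_WEIGHT` table, and the scoring formula (log-frequency times a plausibility weight) are all assumptions chosen purely for illustration.

```python
import math
from collections import Counter

# Toy dependency parses: each sentence is a list of (head, label, dependent) edges.
# The labels "nsubj" and "dobj" are illustrative, not tied to any particular parser.
PARSED_SENTENCES = [
    [("have", "nsubj", "deer"), ("have", "dobj", "antlers")],
    [("have", "nsubj", "deer"), ("have", "dobj", "antlers")],
    [("produce", "nsubj", "flute"), ("produce", "dobj", "sound")],
    [("see", "nsubj", "deer"), ("see", "dobj", "car")],
    [("see", "nsubj", "deer"), ("see", "dobj", "car")],
    [("see", "nsubj", "deer"), ("see", "dobj", "car")],
]

# Hypothetical semantic plausibility of a feature for a concept (assumed values).
SEMANTIC_WEIGHT = {
    ("deer", "antlers"): 0.9,
    ("flute", "sound"): 0.8,
    ("deer", "car"): 0.1,
}
DEFAULT_WEIGHT = 0.2  # assumed fallback for unseen concept-feature pairs


def extract_triples(sentences, concepts):
    """Count (concept, relation, feature) candidates where a verb links
    a target concept (as subject) to a direct object."""
    counts = Counter()
    for edges in sentences:
        subjects = {head: dep for head, label, dep in edges if label == "nsubj"}
        objects = {head: dep for head, label, dep in edges if label == "dobj"}
        for verb, subj in subjects.items():
            if subj in concepts and verb in objects:
                counts[(subj, verb, objects[verb])] += 1
    return counts


def rerank(counts):
    """Order candidates by log-frequency scaled by the semantic weight."""
    scored = []
    for (concept, relation, feature), n in counts.items():
        weight = SEMANTIC_WEIGHT.get((concept, feature), DEFAULT_WEIGHT)
        scored.append(((concept, relation, feature), math.log1p(n) * weight))
    return sorted(scored, key=lambda item: item[1], reverse=True)


counts = extract_triples(PARSED_SENTENCES, {"deer", "flute"})
ranked = rerank(counts)
for triple, score in ranked:
    print(triple, round(score, 3))
```

In this toy data the most frequent candidate is the incidental triple deer see car, but the assumed semantic weights push deer have antlers and flute produce sound above it. This is the kind of effect that re-ranking with external knowledge is meant to achieve.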