The automatic acquisition of feature-based conceptual representations from text corpora can be challenging, given the unconstrained nature of human-generated features. We examine large-scale extraction of concept-relation-feature triples and the utility of syntactic, semantic, and encyclopedic information in guiding this complex task. Traditional methods do not investigate the full range of triples occurring in human-generated norms (e.g. flute produce sound); instead, they target concept-feature pairs (e.g. flute - sound) or triples involving specific relations (e.g. is-a, part-of). We introduce a novel method that extracts candidate triples (e.g. deer have antlers, flute produce sound) from parsed data and re-ranks them using semantic information. We apply this technique to Wikipedia and the British National Corpus and assess its accuracy in a variety of ways. Our work demonstrates the utility of external knowledge in guiding feature extraction and suggests a number of avenues for future work.
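The extract-then-re-rank idea described above can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the dependency labels (`nsubj`, `dobj`), the toy parsed input, the hypothetical `semantic_score` table, and the choice to multiply frequency by a semantic score. The authors' actual candidate patterns and re-ranking function are not specified in the abstract.

```python
from collections import Counter

# Hypothetical pre-parsed input: each sentence is a list of
# (head_lemma, dependency_label, dependent_lemma) arcs.
parsed_sentences = [
    [("have", "nsubj", "deer"), ("have", "dobj", "antler")],
    [("produce", "nsubj", "flute"), ("produce", "dobj", "sound")],
    [("produce", "nsubj", "flute"), ("produce", "dobj", "sound")],
    [("buy", "nsubj", "flute"), ("buy", "dobj", "ticket")],
]

def extract_triples(sentences):
    """Count candidate (concept, relation, feature) triples.

    A candidate is formed whenever a verb has both a subject and a
    direct object in the same sentence (an assumed, simplified pattern).
    """
    counts = Counter()
    for arcs in sentences:
        subjects, objects = {}, {}
        for head, label, dependent in arcs:
            if label == "nsubj":
                subjects.setdefault(head, []).append(dependent)
            elif label == "dobj":
                objects.setdefault(head, []).append(dependent)
        for verb, subjs in subjects.items():
            for concept in subjs:
                for feature in objects.get(verb, []):
                    counts[(concept, verb, feature)] += 1
    return counts

# Hypothetical semantic compatibility scores for (concept, feature) pairs.
# In practice these might come from an external knowledge source.
semantic_score = {
    ("flute", "sound"): 0.9,
    ("deer", "antler"): 0.8,
    ("flute", "ticket"): 0.1,
}

def rerank(counts, scores, default=0.0):
    """Re-rank candidates by frequency weighted by semantic score (assumed scheme)."""
    scored = {
        triple: count * scores.get((triple[0], triple[2]), default)
        for triple, count in counts.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

ranked = rerank(extract_triples(parsed_sentences), semantic_score)
for (concept, relation, feature), score in ranked:
    print(concept, relation, feature, round(score, 2))
```

In this toy setting, the semantically plausible triple flute produce sound ranks above the incidental flute buy ticket, which is the kind of effect re-ranking with semantic information is meant to achieve.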