Newly coined words pose problems for natural language processing systems because they are not in a system's lexicon, and therefore no lexical information is available for them. A common way to form new words is lexical blending, as in cosmeceutical, a blend of cosmetic and pharmaceutical. We propose a statistical model for inferring a blend's source words, drawing on observed linguistic properties of blends; these properties are largely based on the recognizability of the source words in a blend. We annotate a set of 1,186 recently coined expressions, of which 515 are blends, and evaluate our methods on a 324-item subset. In this first study of novel blends, we achieve an accuracy of 40% on the task of inferring a blend's source words, which corresponds to a reduction in error rate of 39% over an informed baseline. We also give preliminary results showing that our features for source word identification can be used to distinguish blends from other kinds of novel words.
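To make the source-word inference task concrete, here is a minimal Python sketch of one plausible first step: enumerating candidate source-word pairs for a blend. It is not the statistical model described above. It assumes (the abstract does not state this) that a blend begins with a prefix of its first source word and ends with a suffix of its second. The function name `candidate_source_pairs` and the toy lexicon are hypothetical and chosen only for illustration.

```python
from itertools import product


def candidate_source_pairs(blend, lexicon):
    """Enumerate (first, second) word pairs that could have formed `blend`.

    Assumption (not taken from the abstract): the blend starts with a
    prefix of the first source word and ends with a suffix of the second.
    """
    # The blend itself is not a useful source word, so exclude it.
    words = {w for w in lexicon if w != blend}
    candidates = set()
    # Try every split of the blend into a non-empty prefix and suffix.
    for i in range(1, len(blend)):
        prefix, suffix = blend[:i], blend[i:]
        firsts = [w for w in words if w.startswith(prefix)]
        seconds = [w for w in words if w.endswith(suffix)]
        candidates.update(product(firsts, seconds))
    return candidates


# Hypothetical toy lexicon, for illustration only.
toy_lexicon = ["cosmetic", "cosmos", "pharmaceutical", "nutraceutical", "cost"]
pairs = candidate_source_pairs("cosmeceutical", toy_lexicon)
for first, second in sorted(pairs):
    print(first, "+", second)
```

Even with this tiny lexicon, more than one candidate pair survives, which is why a model that ranks candidates (for example, by how recognizable each source word remains in the blend) is needed on top of enumeration.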