Automatically identifying the source words of lexical blends in English

Authors:
Paul Cook; Suzanne Stevenson
Venue:
Computational Linguistics
Year:
2010

Abstract

Newly coined words pose problems for natural language processing systems because they are not in a system's lexicon, and therefore no lexical information is available for such words. A common way to form new words is lexical blending, as in cosmeceutical, a blend of cosmetic and pharmaceutical. We propose a statistical model for inferring a blend's source words, drawing on observed linguistic properties of blends; these properties are largely based on the recognizability of the source words in a blend. We annotate a set of 1,186 recently coined expressions, which includes 515 blends, and evaluate our methods on a 324-item subset. In this first study of novel blends, we achieve an accuracy of 40% on the task of inferring a blend's source words, which corresponds to a reduction in error rate of 39% over an informed baseline. We also give preliminary results showing that our features for source word identification can be used to distinguish blends from other kinds of novel words.
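To make the task concrete, the following is a minimal sketch of how candidate source-word pairs for a blend might be enumerated. It assumes the simplest blend structure, a prefix of one word followed by a suffix of another, and it does not reproduce the statistical model described in the abstract. The function name `candidate_source_pairs` and the toy lexicon are hypothetical, introduced only for illustration.

```python
def candidate_source_pairs(blend, lexicon):
    """Enumerate (first, second) word pairs such that the blend can be read
    as a prefix of `first` followed by a suffix of `second`.

    Illustrative assumption only: real blends may overlap or truncate their
    source words in other ways, and this sketch does no ranking.
    """
    blend = blend.lower()
    # The blend itself is never a candidate source word.
    words = {w.lower() for w in lexicon} - {blend}
    pairs = set()
    # Try every split point, keeping at least one character on each side.
    for i in range(1, len(blend)):
        prefix, suffix = blend[:i], blend[i:]
        firsts = [w for w in words if w.startswith(prefix)]
        seconds = [w for w in words if w.endswith(suffix)]
        for first in firsts:
            for second in seconds:
                if first != second:
                    pairs.add((first, second))
    return sorted(pairs)


# Hypothetical toy lexicon; a real system would use a large word list.
TOY_LEXICON = ["cosmetic", "cosmos", "pharmaceutical",
               "nutraceutical", "practical", "cost"]

pairs = candidate_source_pairs("cosmeceutical", TOY_LEXICON)
for first, second in pairs:
    print(first, "+", second)
```

Even with this tiny lexicon, more than one pair is compatible with the blend (both pharmaceutical and nutraceutical supply the ending), which suggests why candidates need to be scored and ranked rather than merely enumerated.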