Much NLP research on Multi-Word Expressions (MWEs) focuses on discovering new expressions rather than on identifying known expressions in text. However, MWE identification is not trivial, because many expressions allow variation in form, and expressions differ in the range of variation they allow. We show that simple rule-based baselines do not perform identification satisfactorily, and we present a supervised learning method for identification that uses sentence surface features based on each expression's canonical form. To evaluate the method, we annotated 3350 sentences from the British National Corpus containing potential uses of 24 verbal MWEs. The method achieves an F-score of 94.86%, compared with 80.70% for the leading rule-based baseline. Our method is easily applicable to any expression type. Experiments in previous research have been limited to the compositional/non-compositional distinction, whereas we also test on sentences in which the words comprising the MWE appear but do not form an expression.
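To make the identification task and its evaluation concrete, here is a minimal sketch of a rule-based identifier scored by F-score. It is not the baseline or the feature set used in the paper: the function names `contains_in_order` and `f_score`, the `max_gap` parameter, and the four toy sentences are all illustrative assumptions. The rule flags a sentence when the words of the canonical form occur in order with a bounded number of intervening tokens.

```python
def contains_in_order(tokens, mwe_words, max_gap=3):
    """Return True if mwe_words occur in order in tokens, with at most
    max_gap tokens between consecutive matched words (hypothetical rule)."""
    def search(start, idx, prev):
        if idx == len(mwe_words):
            return True
        for i in range(start, len(tokens)):
            # Stop once the gap since the previous matched word is too large.
            if prev is not None and i - prev - 1 > max_gap:
                break
            if tokens[i] == mwe_words[idx] and search(i + 1, idx + 1, i):
                return True
        return False
    return search(0, 0, None)


def f_score(gold, predicted):
    """Balanced F-score (harmonic mean of precision and recall)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration): canonical form and labeled sentences.
mwe = ["kick", "the", "bucket"]
examples = [
    ("he will kick the bucket soon", True),                 # idiomatic use
    ("she had to kick the old rusty metal bucket", False),  # words present, not the expression
    ("he kicked the bucket", True),                         # inflected form, missed by exact match
    ("the bucket was full", False),                         # no candidate at all
]
gold = [label for _, label in examples]
predicted = [contains_in_order(text.split(), mwe) for text, _ in examples]
score = f_score(gold, predicted)
print(predicted, score)
```

On this toy data the rule produces one false positive (the literal sentence) and one false negative (the inflected form), which are the two failure modes the abstract attributes to variation in form and to non-expression co-occurrence of the component words.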