This paper describes an original hybrid system that extracts multiword unit candidates from part-of-speech tagged corpora. Whereas classical hybrid systems rely on manually defined local part-of-speech patterns, which mainly identify well-known multiword units such as compound nouns, our solution identifies relevant syntactic patterns automatically from the corpus. Word statistics are then combined with this endogenously acquired linguistic information to extract the most relevant word sequences. As a result, (1) no human intervention is required, so the system can be applied flexibly, and (2) other kinds of multiword units, such as phrasal verbs, adverbial locutions, and prepositional locutions, can be identified. The system has been tested on the Brown Corpus, with encouraging results.
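The idea of combining word statistics with tag statistics learned from the same corpus can be illustrated with a minimal sketch. The abstract does not say which association measure the system uses or how the two sources of evidence are merged, so everything below is an assumption made for illustration: pointwise mutual information (PMI) stands in for the association measure, only adjacent word pairs are scored, and the two scores are mixed by a linear interpolation with a hypothetical weight `alpha`. The function names `pmi_table` and `rank_candidates` are likewise illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_table(sequence):
    """Return PMI (base 2) for every adjacent pair observed in `sequence`."""
    unigrams = Counter(sequence)
    bigrams = Counter(zip(sequence, sequence[1:]))
    n_uni = len(sequence)
    n_bi = max(len(sequence) - 1, 1)
    table = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        table[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return table


def rank_candidates(tagged_tokens, alpha=0.5):
    """Rank adjacent word pairs by a mix of word PMI and tag PMI.

    `tagged_tokens` is a list of (word, tag) pairs.
    `alpha` is a hypothetical interpolation weight (assumption, not from
    the paper): 1.0 uses word statistics only, 0.0 uses tag statistics only.
    """
    words = [w.lower() for w, _ in tagged_tokens]
    tags = [t for _, t in tagged_tokens]
    word_pmi = pmi_table(words)  # word-level association
    tag_pmi = pmi_table(tags)    # tag patterns learned from the corpus itself

    scores = {}
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        key = (w1.lower(), w2.lower())
        score = alpha * word_pmi[key] + (1 - alpha) * tag_pmi[(t1, t2)]
        # A word pair may occur with several tag pairs; keep its best score.
        scores[key] = max(score, scores.get(key, float("-inf")))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates` on a list of `(word, tag)` pairs returns the word pairs sorted from highest to lowest combined score. No tag pattern is written by hand in this sketch: the tag-level scores come from the tagged corpus, which is the property the abstract emphasizes.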