Word association norms, mutual information, and lexicography
Computational Linguistics
Foundations of statistical natural language processing
Foundations of statistical natural language processing
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric
Journal of the ACM (JACM)
EPIA '99 Proceedings of the 9th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Multiword Expressions: A Pain in the Neck for NLP
CICLing '02 Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
Accurate methods for the statistics of surprise and coincidence
Computational Linguistics - Special issue on using large corpora: I
Retrieving collocations from text: Xtract
Computational Linguistics - Special issue on using large corpora: I
Methods for the qualitative evaluation of lexical association measures
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Combining association measures for collocation extraction
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
A linguistic knowledge discovery tool: very large ngram database search with arbitrary wildcards
COLING '08 22nd International Conference on on Computational Linguistics: Demonstration Papers
Improving statistical machine translation using domain bilingual multiword expressions
MWE '09 Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications
The role of multi-word units in interactive information retrieval
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Discriminative strategies to integrate multiword expression recognition and parsing
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Parsing models for identifying multiword expressions
Computational Linguistics
Combining compound recognition and PCFG-LA parsing with word lattices and conditional random fields
ACM Transactions on Speech and Language Processing (TSLP) - Special issue on multiword expressions: From theory to practice and use, part 2
Hi-index | 0.00 |
The identification and extraction of Multiword Expressions (MWEs) currently deliver satisfactory results. However, the integration of these results into a wider application remains an issue. This is mainly due to the fact that the association measures (AMs) used to detect MWEs require a critical amount of data and that the MWE dictionaries cannot account for all the lexical and syntactic variations inherent in MWEs. In this study, we use an alternative technique to overcome these limitations. It consists in defining an n-gram frequency data-base that can be used to compute AMs on-the-fly, allowing the extraction procedure to efficiently process all the MWEs in a text, even if they have not been previously observed.