However large a hand-crafted wide-coverage grammar is, some words and constructions will always be missing from it, and these will cause parse failures. Because of their heterogeneous and flexible nature, Multiword Expressions (MWEs) are an endless source of such failures. As the number of such expressions in a speaker's lexicon is comparable to the number of single-word units (Jackendoff, 1997), a major challenge for robust natural language processing systems is handling MWEs. In this paper we propose to semi-automatically detect MWE candidates in texts using error mining techniques, and to validate them using a combination of the World Wide Web as a corpus and statistical measures. For the candidates that remain, possible lexico-syntactic types are predicted, and the candidates are subsequently added to the grammar as new lexical entries. This approach provides a significant increase in the grammar's coverage of these expressions.
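The validation step can be illustrated with a minimal sketch. The abstract does not name the statistical measures used, so pointwise mutual information (PMI) is assumed here purely for illustration, and the frequency table stands in for counts that would in practice come from web queries. The names `pmi`, `validate_candidates`, the toy counts, and the threshold are all hypothetical.

```python
import math


def pmi(joint_count, word_counts, total):
    """Pointwise mutual information (base 2) of a word sequence.

    joint_count: frequency of the whole sequence
    word_counts: frequencies of the individual words
    total:       size of the corpus the counts were drawn from
    """
    if joint_count == 0 or any(c == 0 for c in word_counts):
        return float("-inf")  # unseen sequences get the lowest possible score
    p_joint = joint_count / total
    p_independent = math.prod(c / total for c in word_counts)
    return math.log2(p_joint / p_independent)


def validate_candidates(candidates, counts, total, threshold=0.0):
    """Keep candidates whose PMI exceeds the (assumed) threshold.

    candidates: iterable of word tuples, e.g. ("by", "large")
    counts:     dict mapping words and space-joined sequences to frequencies
    """
    kept = []
    for words in candidates:
        phrase = " ".join(words)
        score = pmi(counts.get(phrase, 0),
                    [counts.get(w, 0) for w in words],
                    total)
        if score > threshold:
            kept.append((phrase, score))
    # Strongest associations first
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy counts (invented for illustration, not real web frequencies)
TOTAL = 2 ** 20
COUNTS = {
    "by": 2 ** 10, "large": 2 ** 10, "by large": 2 ** 8,
    "the": 2 ** 16, "of": 2 ** 15, "the of": 2 ** 10,
}
CANDIDATES = [("by", "large"), ("the", "of")]

validated = validate_candidates(CANDIDATES, COUNTS, TOTAL)
print(validated)
```

With these toy counts, the sequence whose words co-occur more often than chance is kept, and the one that co-occurs less often than chance is discarded. In a real setting the threshold and the choice of measure would need tuning against annotated data.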