The integration of compounds in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly preidentified. This article evaluates two empirical strategies to incorporate such multiword units in a real PCFG-LA parsing context: (1) the use of a grammar including compound recognition, thanks to specialized annotation schemes for compounds; (2) the use of a state-of-the-art discriminative compound prerecognizer integrating endogenous and exogenous features. We show how these two strategies can be combined with word lattices representing possible lexical analyses generated by the recognizer. The proposed systems display significant gains in terms of multiword recognition and often in terms of standard parsing accuracy. Moreover, we show through an Oracle analysis that this combined strategy opens promising new research directions.
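As a rough illustration of the word-lattice idea mentioned above, the following minimal sketch shows how one candidate compound can be stored as an alternative arc alongside the simple-word arcs, so that a downstream parser may choose between the two segmentations. It is not the authors' implementation: the function names `build_lattice` and `enumerate_paths`, the underscore-joined compound label, and the example sentence are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Arc = Tuple[int, int, str]  # (start node, end node, label)


def build_lattice(tokens: List[str], compound_spans: List[Tuple[int, int]]) -> List[Arc]:
    """Build a word lattice (hypothetical helper, for illustration only).

    Node i sits before token i. Each simple word is an arc i -> i+1;
    each candidate compound covering tokens[start:end] adds an arc start -> end.
    """
    arcs: List[Arc] = [(i, i + 1, tok) for i, tok in enumerate(tokens)]
    for start, end in compound_spans:
        arcs.append((start, end, "_".join(tokens[start:end])))
    return arcs


def enumerate_paths(arcs: List[Arc], n_tokens: int) -> List[List[str]]:
    """List every segmentation encoded by the lattice, from node 0 to node n_tokens."""
    by_start: Dict[int, List[Tuple[int, str]]] = {}
    for start, end, label in arcs:
        by_start.setdefault(start, []).append((end, label))

    def walk(node: int) -> List[List[str]]:
        if node == n_tokens:
            return [[]]  # one empty continuation: the path is complete
        paths: List[List[str]] = []
        for end, label in by_start.get(node, []):
            for rest in walk(end):
                paths.append([label] + rest)
        return paths

    return walk(0)


# Assumed example: a recognizer proposes tokens 1..3 as a single compound.
tokens = ["la", "pomme", "de", "terre", "cuite"]
arcs = build_lattice(tokens, [(1, 4)])
paths = enumerate_paths(arcs, len(tokens))
for path in paths:
    print(" ".join(path))
```

In this sketch the lattice keeps both the word-by-word analysis and the compound analysis; exhaustive path enumeration is used only to make the alternatives visible, whereas a real parser would score the lattice directly rather than expand it.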