The Journal of Machine Learning Research
Non-projective dependency parsing using spanning tree algorithms
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Using the web as an implicit training set: application to structural ambiguity resolution
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Effective self-training for parsing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Introduction to the bio-entity recognition task at JNLPBA
JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
Overview of BioNLP'09 shared task on event extraction
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task
A structured vector space model for word meaning in context
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Using self-trained bilexical preferences to improve disambiguation accuracy
IWPT '07 Proceedings of the 10th International Conference on Parsing Technologies
An efficient algorithm for easy-first non-directional dependency parsing
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
A latent dirichlet allocation method for selectional preferences
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Latent variable models of selectional preference
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
From frequency to meaning: vector space models of semantics
Journal of Artificial Intelligence Research
Exploiting web-derived selectional preference to improve statistical dependency parsing
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Developing a robust part-of-speech tagger for biomedical text
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
Exploring supervised lda models for assigning attributes to adjective-noun phrases
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.00 |
When porting parsers to a new domain, many of the errors are related to wrong attachment of out-of-vocabulary words. Since there is no available annotated data to learn the attachment preferences of the target domain words, we attack this problem using a model of selectional preferences based on domain-specific word classes. Our method uses Latent Dirichlet Allocations (LDA) to learn a domain-specific Selectional Preference model in the target domain using un-annotated data. The model provides features that model the affinities among pairs of words in the domain. To incorporate these new features in the parsing model, we adopt the co-training approach and retrain the parser with the selectional preferences features. We apply this method for adapting Easy First, a fast non-directional parser trained on WSJ, to the biomedical domain (Genia Treebank). The Selectional Preference features reduce error by 4.5% over the co-training baseline.