Current statistical parsers tend to perform well only on their training domain and closely related genres. While strong performance on a few related domains suffices for many applications, parsers should ideally generalize to a wide variety of domains. When parsing document collections that span heterogeneous domains (e.g., the web), the optimal parsing model for each document is typically not obvious. We study this problem as a new task: multiple-source parser adaptation. Our system trains on corpora from many different domains, learning not only the statistics of those domains but also quantitative measures of domain differences and how those differences affect parsing accuracy. Given a specific target text, the system proposes a linear combination of parsing models trained on the source corpora. Tested across six domains, our system outperforms all non-oracle baselines, including the best domain-independent parsing model, demonstrating the value of customizing parsing models to specific domains.
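The core idea, combining source-domain parsing models with weights driven by a measure of domain similarity to the target text, can be sketched as below. This is a minimal illustration, not the paper's actual system: the bag-of-words cosine similarity, the normalization into convex mixture weights, and all function names are illustrative assumptions.

```python
import math
from collections import Counter


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two bag-of-words count vectors
    (a simple stand-in for a learned domain-difference measure)."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mixture_weights(target_text, source_corpora):
    """Weight each source domain by its lexical similarity to the target,
    normalized so the weights form a convex combination (sum to 1)."""
    target_counts = Counter(target_text.split())
    sims = {name: cosine_similarity(target_counts, Counter(text.split()))
            for name, text in source_corpora.items()}
    total = sum(sims.values())
    if total == 0.0:  # no overlap: fall back to a uniform mixture
        return {name: 1.0 / len(sims) for name in sims}
    return {name: s / total for name, s in sims.items()}


def combined_score(parse, models, weights):
    """Linearly interpolate per-domain model scores for one candidate parse.
    `models` maps a domain name to a scoring function (hypothetical here)."""
    return sum(weights[name] * models[name](parse) for name in models)
```

In the real system the weights are learned from how measured domain differences correlate with parsing accuracy; the sketch substitutes a fixed similarity heuristic to keep the example self-contained.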