State-of-the-art statistical parsers and part-of-speech (POS) taggers perform very well when trained with large amounts of in-domain data. When training data is out-of-domain or limited, accuracy degrades. In this paper, we aim to compensate for the lack of available training data by exploiting similarities between test set sentences. We show how to augment sentence-level models for parsing and POS tagging with inter-sentence consistency constraints. To deal with the resulting global objective, we present an efficient and exact dual decomposition decoding algorithm. In experiments, we add consistency constraints to the MST parser and the Stanford part-of-speech tagger and demonstrate significant error reduction in both the domain adaptation and lightly supervised settings across five languages.
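To make the idea of dual decomposition decoding under a consistency constraint concrete, here is a minimal toy sketch. It is not the paper's algorithm. It assumes a single word type that occurs in several test sentences, independent per-occurrence tag scores standing in for the sentence-level models, and a hard constraint that every occurrence receives the same tag. The function name `agree_by_dual_decomposition`, the score values, and the step-size schedule are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def agree_by_dual_decomposition(
    scores: List[Dict[str, float]],
    tags: List[str],
    max_iters: int = 100,
) -> Tuple[str, bool]:
    """Toy subgradient dual decomposition (illustrative assumption, not the paper's method).

    scores[i][t] is a local model's score for giving occurrence i the tag t.
    Returns (consensus_tag, converged).
    """
    # One Lagrange multiplier per (occurrence, tag), initialised to zero.
    u = [{t: 0.0 for t in tags} for _ in scores]
    z = tags[0]
    for k in range(max_iters):
        # Each occurrence decodes independently with multiplier-adjusted scores.
        y = [max(tags, key=lambda t: s[t] + ui[t]) for s, ui in zip(scores, u)]
        # The consensus variable picks the tag that maximises -sum_i u_i(t).
        z = max(tags, key=lambda t: -sum(ui[t] for ui in u))
        # If every occurrence agrees with the consensus, the solution is exact.
        if all(yi == z for yi in y):
            return z, True
        # Subgradient step: penalise the disagreeing choice, reward the consensus.
        step = 1.0 / (k + 1)
        for ui, yi in zip(u, y):
            if yi != z:
                ui[yi] -= step
                ui[z] += step
    return z, False


# Hypothetical scores for one word type seen in three sentences.
TAGS = ["NN", "VB"]
SCORES = [
    {"NN": 2.0, "VB": 0.5},
    {"NN": 1.5, "VB": 1.0},
    {"NN": 0.2, "VB": 1.0},  # decoded alone, this occurrence would prefer VB
]

if __name__ == "__main__":
    tag, converged = agree_by_dual_decomposition(SCORES, TAGS)
    print(tag, converged)
```

In this toy, the third occurrence would be tagged differently if decoded in isolation, and the multipliers gradually pull the independent decisions toward a single tag. When the loop stops with full agreement, the result matches the tag that maximises the summed scores under the constraint, which is the sense in which a converged dual decomposition solution is exact.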