This paper investigates bootstrapping for statistical parsers as a way to reduce their reliance on manually annotated training data. We consider two approaches: co-training, a mostly unsupervised approach in which two parsers are iteratively re-trained on each other's output, and corrected co-training, a semi-supervised approach in which a human corrects each parser's output before it is added to the training data. The selection of labeled training examples is an integral part of both frameworks. We propose several selection methods based on two criteria: minimizing errors in the data and maximizing training utility. We show that incorporating the utility criterion into the selection method results in better parsers for both frameworks.
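To make the two selection criteria concrete, the following is a minimal sketch of one possible selection step in a co-training round. It is not the paper's actual method: the function name `select_for_student`, the `Scorer` type, the confidence threshold, and the use of "teacher confidence minus student confidence" as a utility proxy are all assumptions made for illustration. A parser is modeled here as any callable that returns a parse and a confidence score in [0, 1].

```python
from typing import Callable, List, Tuple

# Hypothetical type: a scorer maps a sentence to (parse, confidence in [0, 1]).
Scorer = Callable[[str], Tuple[str, float]]


def select_for_student(
    sentences: List[str],
    teacher: Scorer,
    student: Scorer,
    min_teacher_conf: float = 0.8,
    k: int = 2,
) -> List[Tuple[str, str]]:
    """Pick up to k teacher-labeled sentences to add to the student's training data."""
    scored = []
    for sentence in sentences:
        teacher_parse, teacher_conf = teacher(sentence)
        _, student_conf = student(sentence)
        # Error criterion (assumed proxy): skip parses the teacher is unsure about.
        if teacher_conf < min_teacher_conf:
            continue
        # Utility criterion (assumed proxy): prefer sentences the student finds hard.
        scored.append((teacher_conf - student_conf, sentence, teacher_parse))
    # Highest utility first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(sentence, parse) for _, sentence, parse in scored[:k]]


# Toy stand-ins for two parsers, using fixed confidence tables.
TEACHER_CONF = {"a": 0.90, "b": 0.95, "c": 0.50, "d": 0.85}
STUDENT_CONF = {"a": 0.88, "b": 0.40, "c": 0.10, "d": 0.60}


def toy_teacher(sentence: str) -> Tuple[str, float]:
    return "T(" + sentence + ")", TEACHER_CONF[sentence]


def toy_student(sentence: str) -> Tuple[str, float]:
    return "S(" + sentence + ")", STUDENT_CONF[sentence]


selected = select_for_student(["a", "b", "c", "d"], toy_teacher, toy_student)
```

In this toy run, sentence "c" is dropped because the teacher's confidence falls below the threshold, and "b" and "d" are chosen over "a" because the student is much less confident on them. In co-training the two parsers would swap teacher and student roles each round; in corrected co-training a human would fix the selected parses before they are added.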