Procedure for quantitatively comparing the syntactic coverage of English grammars
HLT '91 Proceedings of the workshop on Speech and Natural Language
Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
Analyzing the effectiveness and applicability of co-training
Proceedings of the ninth international conference on Information and knowledge management
Inference and Estimation of a Long-Range Trigram Model
ICGI '94 Proceedings of the Second International Colloquium on Grammatical Inference and Applications
Enhancing Supervised Learning with Unlabeled Data
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Learning restricted probabilistic link grammars
Connectionist, Statistical, and Symbolic Approaches to Learning for Natural Language Processing
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Tagging English text with a probabilistic model
Computational Linguistics
Does Baum-Welch re-estimation help taggers?
ANLC '94 Proceedings of the fourth conference on Applied natural language processing
A practical part-of-speech tagger
ANLC '92 Proceedings of the third conference on Applied natural language processing
Exploiting syntactic structure for language modeling
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Unsupervised word sense disambiguation rivaling supervised methods
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Probabilistic tree-adjoining grammar as a framework for statistical natural language processing
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Stochastic lexicalized tree-adjoining grammars
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Sample selection for statistical grammar induction
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
A uniform method of grammar extraction and its applications
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Bootstrapping statistical parsers from small datasets
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Recovering latent information in treebanks
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Active learning for statistical natural language parsing
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Applying Co-Training to reference resolution
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Evaluating translational correspondence using annotation projection
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Weakly supervised natural language learning without redundant views
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Example selection for bootstrapping statistical parsers
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Bootstrapping parsers via syntactic projection across parallel texts
Natural Language Engineering
Tri-Training: Exploiting Unlabeled Data Using Three Classifiers
IEEE Transactions on Knowledge and Data Engineering
Sample Selection for Statistical Parsing
Computational Linguistics
Bootstrapping POS taggers using unlabelled data
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Enhancing relevance feedback in image retrieval using unlabeled data
ACM Transactions on Information Systems (TOIS)
Automated extraction of Tree-Adjoining Grammars from treebanks
Natural Language Engineering
A backoff model for bootstrapping resources for non-English languages
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Effective self-training for parsing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Boosting statistical word alignment using labeled and unlabeled data
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Semisupervised Regression with Cotraining-Style Algorithms
IEEE Transactions on Knowledge and Data Engineering
The bootstrapping of the Yarowsky algorithm in real corpora
Information Processing and Management: an International Journal
Semi-supervised co-training and active learning based approach for multi-view intrusion detection
Proceedings of the 2009 ACM symposium on Applied Computing
When does Co-training Work in Real Data?
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
MCS '09 Proceedings of the 8th International Workshop on Multiple Classifier Systems
Extractive summarization using supervised and semi-supervised learning
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Semi-supervised learning with very few labeled training examples
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Spoken language understanding using weakly supervised learning
Computer Speech and Language
Active learning with multiple views
Journal of Artificial Intelligence Research
Gesture salience as a hidden variable for coreference resolution and keyframe extraction
Journal of Artificial Intelligence Research
Automatically extracting and comparing lexicalized grammars for different languages
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Semi-supervised regression with co-training
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Co-training for cross-lingual sentiment classification
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Weakly-Supervised Violence Detection in Movies with Audio and Video Based Co-training
PCM '09 Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
Faster parsing by supertagger adaptation
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Effective constituent projection across languages
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Analyzing and integrating dependency parsers
Computational Linguistics
Automatically building training examples for entity extraction
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Bilingual co-training for sentiment classification of chinese product reviews
Computational Linguistics
SETRED: self-training with editing
PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Relaxed cross-lingual projection of constituent syntax
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Lateen EM: unsupervised training with multiple objectives, applied to dependency grammar induction
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
DCPE co-training for classification
Neurocomputing
Robust conversion of CCG derivations to phrase structure trees
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Iterative annotation transformation with predict-self reestimation for Chinese word segmentation
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Improving multi-view semi-supervised learning with agreement-based sampling
Intelligent Data Analysis - Combined Learning Methods and Mining Complex Data
Hi-index | 0.01 |
We propose a novel Co-Training method for statistical parsing. The algorithm takes as input a small corpus (9695 sentences) annotated with parse trees, a dictionary of possible lexicalized structures for each word in the training set and a large pool of unlabeled text. The algorithm iteratively labels the entire data set with parse trees. Using empirical results based on parsing the Wall Street Journal corpus we show that training a statistical parser on the combined labeled and unlabeled data strongly out-performs training only on the labeled data.