We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE), using unlabeled sentences to improve supervised classification methods. WCDL iteratively builds a class-label distribution for each word in the dictionary by averaging predicted labels over all occurrences in the unlabeled corpus, then re-trains a base classifier with these distributions added as word features. In contrast, traditional self-training or co-training methods self-label examples (rather than features), which can degrade performance due to an incestuous learning bias. WCDL exhibits robust behavior and has no difficult parameters to tune. We applied our method to German and English named entity recognition (NER) tasks. WCDL improves over self-training, multi-task semi-supervision, and supervision alone, in particular yielding a state-of-the-art 75.72 F1 score on the German NER task.
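The abstract's loop (average predicted labels per word over the unlabeled corpus, then re-train with those distributions as features) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the `MemorizeClassifier` is a hypothetical toy stand-in for the base classifier, and the tag set, feature passing, and iteration count are invented for demonstration.

```python
from collections import defaultdict

CLASSES = ["O", "PER"]  # hypothetical toy tag set, not the CoNLL label set

class MemorizeClassifier:
    """Toy stand-in for the base classifier: memorizes the most frequent
    tag seen for each word in the labeled data, defaulting to 'O'.
    A real base classifier would also use the distribution features."""
    def train(self, examples):
        votes = defaultdict(lambda: defaultdict(int))
        for tokens, tags, _feats in examples:
            for w, t in zip(tokens, tags):
                votes[w][t] += 1
        self.tag = {w: max(v, key=v.get) for w, v in votes.items()}
        return self

    def predict(self, tokens, _feats):
        return [self.tag.get(w, "O") for w in tokens]

def wcdl(base, labeled, unlabeled, classes=CLASSES, n_iters=2):
    """labeled: list of (tokens, tags); unlabeled: list of token lists.
    Returns the final model and the per-word class distributions."""
    word_dist = defaultdict(lambda: [0.0] * len(classes))
    model = None
    for _ in range(n_iters):
        # Re-train the base classifier, appending the current word-class
        # distributions as extra word features.
        model = base.train([(toks, tags, [word_dist[w] for w in toks])
                            for toks, tags in labeled])
        # Predict labels for every token occurrence in the unlabeled
        # corpus and average them into per-word class distributions.
        counts = defaultdict(lambda: [0] * len(classes))
        for toks in unlabeled:
            preds = model.predict(toks, [word_dist[w] for w in toks])
            for w, c in zip(toks, preds):
                counts[w][classes.index(c)] += 1
        for w, cnt in counts.items():
            total = sum(cnt)
            word_dist[w] = [x / total for x in cnt]
    return model, dict(word_dist)
```

Note that, in line with the abstract, the loop never adds self-labeled *examples* to the training set; only aggregated per-word label distributions feed back into training, which is what distinguishes WCDL from self-training.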