Combining labeled and unlabeled data with co-training
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
Unsupervised word sense disambiguation rivaling supervised methods
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Semi-supervised conditional random fields for improved sequence segmentation and labeling
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Confidence estimation for information extraction
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
One class per named entity: exploiting unlabeled text for named entity recognition
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Crosslingual distant supervision for extracting relations of different complexity
Proceedings of the 21st ACM international conference on Information and knowledge management
Learning multilingual named entity recognition from Wikipedia
Artificial Intelligence
ICCCI'12 Proceedings of the 4th international conference on Computational Collective Intelligence: technologies and applications - Volume Part I
A joint model to identify and align bilingual named entities
Computational Linguistics
We present a simple semi-supervised learning algorithm for named entity recognition (NER) using conditional random fields (CRFs). The algorithm exploits evidence that is independent of the features used by the classifier, and this independent evidence provides high-precision labels for unlabeled data. The labels are used to automatically extract accurate, non-redundant training examples, yielding a much improved classifier at the next iteration. We show that our algorithm achieves an average improvement of 12 points in recall and 4 points in precision over the supervised algorithm. We also show that our algorithm maintains high accuracy when the training and test sets come from different domains.
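The iterative scheme the abstract describes can be sketched as a self-training loop. In this minimal illustration (all names hypothetical, not from the paper), a gazetteer lookup stands in for the "independent evidence": it labels unlabeled tokens without using the classifier's own lexical features, and only previously unseen tokens are added, approximating the non-redundancy requirement. A real system would use a CRF over full sentences rather than a per-token lookup model.

```python
# Hypothetical sketch of self-training with feature-independent evidence.
# The "classifier" is a majority-label lookup table; the gazetteer plays
# the role of independent, high-precision evidence on unlabeled data.

def train(labeled):
    """Learn a majority-label lookup from (token, label) pairs."""
    counts = {}
    for token, label in labeled:
        counts.setdefault(token, {}).setdefault(label, 0)
        counts[token][label] += 1
    return {tok: max(lab, key=lab.get) for tok, lab in counts.items()}

def self_train(labeled, unlabeled, gazetteer, iterations=3):
    """Iteratively add gazetteer-labeled tokens and retrain."""
    labeled = list(labeled)
    seen = {tok for tok, _ in labeled}
    for _ in range(iterations):
        # Independent evidence labels unseen tokens with high precision.
        newly = [(tok, gazetteer[tok]) for tok in unlabeled
                 if tok in gazetteer and tok not in seen]
        if not newly:
            break  # no new non-redundant evidence; stop early
        labeled.extend(newly)
        seen.update(tok for tok, _ in newly)
    return train(labeled)

# Toy usage with made-up tokens:
labeled = [("Paris", "LOC"), ("runs", "O")]
gazetteer = {"Tokyo": "LOC", "IBM": "ORG"}
model = self_train(labeled, ["Tokyo", "walks", "IBM"], gazetteer)
```

After the loop, `model` covers both the original labeled tokens and the gazetteer-confirmed ones, so recall grows while precision is protected by only admitting high-confidence evidence.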