Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Learning dictionaries for information extraction by multi-level bootstrapping
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Unsupervised named-entity extraction from the web: an experimental study
Artificial Intelligence
A bootstrapping method for learning semantic lexicons using extraction pattern contexts
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Learning subjective nouns using extraction pattern bootstrapping
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
Unsupervised information extraction approach using graph mutual reinforcement
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Learning domain-specific information extraction patterns from the Web
IEBeyondDoc '06 Proceedings of the Workshop on Information Extraction Beyond The Document
Automatically generating extraction patterns from untagged text
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Hi-index | 0.02 |
Bootstrapping is a weakly supervised algorithm that has been the focus of attention in many Information Extraction(IE) and Natural Language Processing(NLP) fields, especially in learning semantic lexicons. In this paper, we propose a new bootstrapping algorithm called Mutual Screening Graph Algorithm (MSGA) to learn semantic lexicons. The approach uses only unannotated corpus and a few of seed words to learn new words for each semantic category. By changing the format of extracted patterns and the method for scoring patterns and words, we improve the former bootstrapping algorithm. We also evaluate the semantic lexicons produced by MSGA with previous bootstrapping algorithm Basilisk [1] and GMR (Graph Mutual Reinforcement based Bootstrapping) [4]. Experiments have shown that MSGA can outperform those approaches.