Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Learning Information Extraction Rules for Semi-Structured and Free Text
Machine Learning - Special issue on natural language learning
Learning dictionaries for information extraction by multi-level bootstrapping
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Learning pattern rules for Chinese named entity extraction
Eighteenth national conference on Artificial intelligence
A Global Rule Induction Approach to Information Extraction
ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence
Unsupervised learning of soft patterns for generating definitions from online news
Proceedings of the 13th international conference on World Wide Web
A bootstrapping approach to named entity classification using successive learners
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Counter-training in discovery of semantic patterns
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Mining soft-matching rules from textual data
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Adaptive information extraction from text by rule induction and generalisation
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Automatically generating extraction patterns from untagged text
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Generic soft pattern models for definitional question answering
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Soft pattern matching models for definitional question answering
ACM Transactions on Information Systems (TOIS)
ARE: instance splitting strategies for dependency relation-based information extraction
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
A local tree alignment-based soft pattern matching approach for information extraction
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Combining relations for information extraction from free text
ACM Transactions on Information Systems (TOIS)
Template-based information extraction without the templates
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Multi event extraction guided by global constraints
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Hi-index | 0.00 |
Current rule induction techniques based on hard matching (i.e., strict slot-by-slot matching) tend to fare poorly in extracting information from natural language texts, which often exhibit great variations. The reason is that hard matching techniques result in relatively high precision but low recall. To tackle this problem, we take advantage of the newly proposed soft pattern rules which offer high recall through the use of probabilistic matching. We propose a bootstrapping framework in which soft and hard matching pattern rules are combined in a cascading manner to realize a weakly supervised rule induction scheme. The system starts with a small set of hand-tagged instances. At each iteration, we first generate soft pattern rules and utilize them to tag new training instances automatically. We then apply hard pattern rule induction on the overall tagged data to generate more precise rules, which are used to tag the data again. The process can be repeated until satisfactory results are obtained. Our experimental results show that our bootstrapping scheme with two cascaded learners approaches the performance of a fully supervised information extraction system while using much fewer hand-tagged instances.