In this paper, we describe a rote extractor that learns patterns for finding semantic relationships in unrestricted text, with new procedures for pattern generalization and scoring. These include the use of part-of-speech tags to guide the generalization, Named Entity categories inside the patterns, an edit-distance-based pattern generalization algorithm, and a pattern accuracy calculation procedure based on evaluating the patterns on several test corpora. In an evaluation with 14 entities, the system attains a precision higher than 50% for half of the relationships considered.
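As a rough illustration of what edit-distance-based pattern generalization can look like, the sketch below aligns two tokenized patterns with a standard token-level Levenshtein computation and replaces every position where they disagree with a wildcard. This is not the paper's algorithm: the function name `generalize`, the `*` wildcard, and the example patterns are assumptions made for illustration, and the part-of-speech guidance mentioned in the abstract is left out.

```python
from typing import List

WILDCARD = "*"  # hypothetical placeholder for "any token(s)"; not taken from the paper


def generalize(a: List[str], b: List[str]) -> List[str]:
    """Generalize two token patterns by aligning them with token-level edit distance.

    Tokens shared by both patterns in the optimal alignment are kept; every
    substitution, insertion, or deletion becomes a wildcard, and adjacent
    wildcards are collapsed into one.
    """
    n, m = len(a), len(b)
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a[i-1]
                dist[i][j - 1] + 1,         # insert b[j-1]
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Walk back from the bottom-right corner to recover one optimal alignment.
    out: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            out.append(a[i - 1])  # token shared by both patterns
            i -= 1
            j -= 1
            continue
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            i -= 1  # substitution
            j -= 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            i -= 1  # token present only in a
        else:
            j -= 1  # token present only in b
        if not out or out[-1] != WILDCARD:
            out.append(WILDCARD)
    out.reverse()
    return out


if __name__ == "__main__":
    p1 = ["PERSON", "was", "born", "in", "LOCATION"]
    p2 = ["PERSON", "was", "raised", "in", "LOCATION"]
    print(generalize(p1, p2))  # ['PERSON', 'was', '*', 'in', 'LOCATION']
```

A bare wildcard is the bluntest possible generalization. The abstract indicates that the system instead uses part-of-speech tags to guide this step, which would presumably constrain what a mismatched position is allowed to match.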