The World Wide Web contains a massive amount of information in unstructured natural language, and extracting valuable information from informally written Web documents is a major research challenge. One research focus is Open Information Extraction (OIE), which aims at relation-independent information extraction: OIE systems seek to extract all potential relations from text rather than a few pre-defined ones. Existing OIE systems have mainly addressed the Web's heterogeneity rather than its informality. The performance of REVERB, a state-of-the-art OIE system, drops dramatically as the informality of Web documents increases. This paper proposes a Hybrid Ripple-Down Rules based Open Information Extraction (Hybrid RDROIE) system, which applies the incremental learning technique of Ripple-Down Rules (RDR) as an add-on to REVERB to correct the performance degradation caused by the Web's informality in a domain of interest. With this wrapper approach, the baseline performance is that of REVERB, and RDR corrects its errors in the domain of interest. The Hybrid RDROIE system doubled REVERB's performance in a domain of interest after two hours of training.
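The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use REVERB's actual interface: `toy_base_extractor`, `Rule`, `evaluate`, `hybrid_extract`, and the example rules are all hypothetical names chosen for illustration. The sketch assumes a single-classification RDR tree in which the last rule that fires determines the output, and the base extractor's result is used whenever no rule fires.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Triple = Tuple[str, str, str]          # (arg1, relation, arg2)
Case = Tuple[str, Optional[Triple]]    # (sentence, base extraction or None)


def toy_base_extractor(sentence: str) -> Optional[Triple]:
    """Hypothetical stand-in for a conventional OIE system (not REVERB's API)."""
    m = re.search(r"(\w+) (lives in|works at) (\w+)", sentence)
    return m.groups() if m else None


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: Callable[[Case], Optional[Triple]]
    if_true: Optional["Rule"] = None    # exception rules, tried when this rule fires
    if_false: Optional["Rule"] = None   # alternative rules, tried when it does not


def evaluate(root: Optional[Rule], case: Case, default: Optional[Triple]) -> Optional[Triple]:
    """Walk the rule tree; the last rule that fired supplies the result."""
    result, node = default, root
    while node is not None:
        if node.condition(case):
            result = node.conclusion(case)
            node = node.if_true
        else:
            node = node.if_false
    return result


def fix_livs(case: Case) -> Optional[Triple]:
    # Assumed correction rule: recover a relation from a common misspelling.
    m = re.search(r"(\w+) livs in (\w+)", case[0])
    return (m.group(1), "lives in", m.group(2)) if m else None


# One correction rule plus one exception: questions should yield no extraction.
rules = Rule(
    condition=lambda c: c[1] is None and " livs in " in c[0],
    conclusion=fix_livs,
    if_true=Rule(
        condition=lambda c: c[0].rstrip().endswith("?"),
        conclusion=lambda c: None,
    ),
)


def hybrid_extract(sentence: str) -> Optional[Triple]:
    base = toy_base_extractor(sentence)
    return evaluate(rules, (sentence, base), default=base)
```

In this sketch, well-formed input passes through unchanged (the baseline is the base extractor), while rules added incrementally by a domain expert only take effect on the cases the base extractor gets wrong.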