Extracting multilingual natural-language patterns for RDF predicates

Authors:
Daniel Gerber; Axel-Cyrille Ngonga Ngomo
Affiliations:
Institut für Informatik, AKSW, Universität Leipzig, Leipzig, Germany; Institut für Informatik, AKSW, Universität Leipzig, Leipzig, Germany
Venue:
EKAW'12 Proceedings of the 18th international conference on Knowledge Engineering and Knowledge Management
Year:
2012

Abstract

Most knowledge sources on the Data Web were extracted from structured or semi-structured data. They therefore cover only a small fraction of the information available on the document-oriented Web. In this paper, we present BOA, a bootstrapping strategy for extracting RDF from text. The idea behind BOA is to use background knowledge from the Data Web to extract, from unstructured data, natural-language patterns that represent predicates found on the Data Web. These patterns are then used to extract instance knowledge from natural-language text. Finally, this knowledge is fed back into the Data Web, thereby closing the loop. The approach followed by BOA is largely independent of the language in which the corpus is written. We demonstrate our approach by applying it to four different corpora and two different languages. We evaluate BOA on these data sets using DBpedia as background knowledge. Our results show that we can extract several thousand new facts in one iteration with very high accuracy.
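
The loop described in the abstract can be illustrated with a minimal sketch. It is not BOA's implementation; it only assumes that background knowledge is a set of (subject label, object label) pairs for one predicate, that a pattern is the text found between the two labels in a sentence, and that candidate entities are runs of capitalized words. The function names, the toy corpus, and the entity heuristic are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Assumption: a candidate entity is a run of capitalized words.
# A real system would use a proper entity recognizer instead.
ENTITY = r"([A-Z][\w-]*(?: [A-Z][\w-]*)*)"


def learn_patterns(sentences, known_pairs):
    """Count the text found between a known subject and object label."""
    patterns = Counter()
    for sentence in sentences:
        for subj, obj in known_pairs:
            regex = re.escape(subj) + r"\s+(.+?)\s+" + re.escape(obj)
            match = re.search(regex, sentence)
            if match:
                patterns[match.group(1)] += 1
    return patterns


def extract_facts(sentences, patterns, known_pairs):
    """Apply the learned patterns and return pairs not yet known."""
    new_pairs = set()
    for infix in patterns:
        regex = re.compile(ENTITY + r"\s+" + re.escape(infix) + r"\s+" + ENTITY)
        for sentence in sentences:
            for subj, obj in regex.findall(sentence):
                if (subj, obj) not in known_pairs:
                    new_pairs.add((subj, obj))
    return new_pairs


# Toy data (hypothetical): one known fact for a "located in" style predicate.
KNOWN_PAIRS = {("Leipzig", "Germany")}
SENTENCES = [
    "Leipzig is a city in Germany.",
    "Lyon is a city in France.",
    "Paris hosts many museums.",
]

if __name__ == "__main__":
    learned = learn_patterns(SENTENCES, KNOWN_PAIRS)
    print(learned)
    print(extract_facts(SENTENCES, learned, KNOWN_PAIRS))
```

On this toy corpus the single known pair yields the pattern "is a city in", which in turn proposes the new pair (Lyon, France). Feeding such pairs back into the set of known pairs and repeating both steps would correspond to one further bootstrapping iteration; scoring and filtering of patterns, which any practical system needs, is left out of the sketch.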