Ontology-aware partitioning for knowledge graph identification

Authors:
Jay Pujara;Hui Miao;Lise Getoor;William W. Cohen
Affiliations:
University of Maryland, College Park, Maryland, USA;University of Maryland, College Park, Maryland, USA;University of Maryland, College Park, Maryland, USA;Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Venue:
Proceedings of the 2013 workshop on Automated knowledge base construction
Year:
2013

Abstract

Knowledge graphs provide a powerful representation of entities and the relationships between them, but automatically constructing such graphs from noisy extractions presents numerous challenges. Knowledge graph identification (KGI) is a technique for knowledge graph construction that jointly reasons about entities, attributes, and relations in the presence of uncertain inputs and ontological constraints. Although knowledge graph identification shows promise in scaling to knowledge graphs built from millions of extractions, increasingly powerful extraction engines may soon require knowledge graphs built from billions of extractions. One tool for scaling is partitioning the extractions so that reasoning can occur in parallel. We explore approaches that leverage ontological information and distributional information in partitioning. We compare these techniques with hash-based approaches and show that a richer partitioning model, one that incorporates the ontology graph and the distribution of extractions, provides superior results. Our results demonstrate that partitioning can yield order-of-magnitude speedups without reducing model performance.
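
To make the contrast between hash-based and ontology-aware partitioning concrete, the following is a minimal sketch and not the authors' implementation. It assumes extractions are (subject, predicate, object) triples, that the ontology graph has predicates as vertices and ontological constraints as edges, and that vertex weights are extraction counts. The toy data, the function names, the choice of the predicate name as hash key, and the greedy region-growing heuristic are all assumptions made for illustration; the abstract does not specify which graph partitioner is used.

```python
import zlib
from collections import Counter, defaultdict, deque

# Hypothetical toy data: (subject, predicate, object) extractions.
EXTRACTIONS = [
    ("mj", "athlete", "true"), ("bulls", "team", "true"),
    ("mj", "playsFor", "bulls"), ("paris", "city", "true"),
    ("france", "country", "true"), ("paris", "locatedIn", "france"),
]
# Hypothetical ontology edges, e.g. domain/range constraints between predicates.
ONTOLOGY_EDGES = [
    ("playsFor", "athlete"), ("playsFor", "team"),
    ("locatedIn", "city"), ("locatedIn", "country"),
]

def hash_assignment(predicates, k):
    """Baseline: place each predicate by a stable hash, ignoring the ontology."""
    return {p: zlib.crc32(p.encode("utf-8")) % k for p in predicates}

def ontology_partition(extractions, ontology_edges, k):
    """Greedy region growing over the ontology graph (an illustrative heuristic).

    Each partition starts from the heaviest unassigned predicate and absorbs
    ontology neighbours breadth-first until it holds about 1/k of the
    extractions; the last partition takes whatever remains.
    """
    weight = Counter(pred for _, pred, _ in extractions)
    neighbors = defaultdict(set)
    for a, b in ontology_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    unassigned = set(weight) | set(neighbors)
    target = sum(weight.values()) / k
    assignment = {}
    for part in range(k):
        load, queue = 0, deque()
        while unassigned and (load < target or part == k - 1):
            if not queue:  # start a new region from the heaviest predicate
                queue.append(min(unassigned, key=lambda p: (-weight[p], p)))
            pred = queue.popleft()
            if pred not in unassigned:
                continue
            unassigned.discard(pred)
            assignment[pred] = part
            load += weight[pred]
            queue.extend(sorted(n for n in neighbors[pred] if n in unassigned))
    return assignment

def cut_edges(assignment, ontology_edges):
    """Count ontological constraints whose predicates land in different partitions."""
    return sum(1 for a, b in ontology_edges if assignment[a] != assignment[b])

if __name__ == "__main__":
    aware = ontology_partition(EXTRACTIONS, ONTOLOGY_EDGES, 2)
    hashed = hash_assignment(set(aware), 2)
    print("ontology-aware cut:", cut_edges(aware, ONTOLOGY_EDGES))
    print("hash-based cut:", cut_edges(hashed, ONTOLOGY_EDGES))
```

Under these assumptions, the number of cut edges is a rough proxy for how many ontological constraints can no longer be enforced within a single partition. On the toy data, the region-growing variant keeps each relation together with its domain and range labels, while the hash baseline has no such guarantee.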