PIDGIN: ontology alignment using web text as interlingua

  • Authors:
  • Derry Wijaya;Partha Pratim Talukdar;Tom Mitchell

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, Pennsylvania, USA;Carnegie Mellon University, Pittsburgh, Pennsylvania, USA;Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

  • Venue:
  • Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of aligning ontologies and database schemas across different knowledge bases and databases is fundamental to knowledge management problems, including the problem of integrating the disparate knowledge sources that form the semantic web's Linked Data [5]. We present a novel approach to this ontology alignment problem that employs a very large natural language text corpus as an interlingua to relate different knowledge bases (KBs). The result is a scalable and robust method (PIDGIN) that aligns relations and categories across different KBs by analyzing both (1) shared relation instances across these KBs, and (2) the verb phrases in the text instantiations of these relation instances. Experiments with PIDGIN demonstrate its superior performance when aligning ontologies across large existing KBs including NELL, Yago and Freebase. Furthermore, we show that in addition to aligning ontologies, PIDGIN can automatically learn from text, the verb phrases to identify relations, and can also type the arguments of relations of different KBs.