NAGA: Searching and Ranking Knowledge

  • Authors:
  • Gjergji Kasneci; Fabian M. Suchanek; Georgiana Ifrim; Maya Ramanath; Gerhard Weikum

  • Affiliations:
  • Max-Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbruecken, Germany. kasneci@mpi-inf.mpg.de; Max-Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbruecken, Germany. suchanek@mpi-inf.mpg.de; Max-Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbruecken, Germany. ifrim@mpi-inf.mpg.de; Max-Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbruecken, Germany. ramanath@mpi-inf.mpg.de; Max-Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbruecken, Germany. weikum@mpi-inf.mpg.de

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

The Web has the potential to become the world's largest knowledge base. In order to unleash this potential, the wealth of information available on the Web needs to be extracted and organized. There is a need for new querying techniques that are simple and yet more expressive than those provided by standard keyword-based search engines. Searching for knowledge rather than Web pages needs to consider inherent semantic structures like entities (person, organization, etc.) and relationships (isA, locatedIn, etc.). In this paper, we propose NAGA, a new semantic search engine. NAGA builds on a knowledge base, which is organized as a graph with typed edges, and consists of millions of entities and relationships extracted from Web-based corpora. A graph-based query language enables the formulation of queries with additional semantic information. We introduce a novel scoring model, based on the principles of generative language models, which formalizes several notions such as confidence, informativeness and compactness and uses them to rank query results. We demonstrate NAGA's superior result quality over state-of-the-art search engines and question answering systems.
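
To make the idea of a graph with typed edges and a graph-based query more concrete, here is a minimal Python sketch. It is not NAGA's implementation: the `Fact` type, the `?variable` pattern syntax, the toy facts, and the confidence values are all assumptions made for illustration. The ranking simply multiplies the per-fact confidences, a deliberately simple stand-in for the generative scoring model the abstract describes, which also accounts for informativeness and compactness.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One typed edge of the toy knowledge graph (hypothetical layout)."""
    subject: str
    relation: str
    obj: str
    confidence: float  # assumed extraction confidence in [0, 1]


# Toy data; the confidence values are invented for this example.
FACTS = [
    Fact("Max_Planck", "isA", "physicist", 0.95),
    Fact("Max_Planck", "bornIn", "Kiel", 0.90),
    Fact("Kiel", "locatedIn", "Germany", 0.98),
    Fact("Albert_Einstein", "isA", "physicist", 0.97),
    Fact("Albert_Einstein", "bornIn", "Ulm", 0.92),
    Fact("Ulm", "locatedIn", "Germany", 0.60),
]


def is_var(term):
    """Query terms starting with '?' are variables (assumed syntax)."""
    return term.startswith("?")


def unify(pattern, fact, binding):
    """Extend `binding` so that `pattern` matches `fact`, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, (fact.subject, fact.relation, fact.obj)):
        if is_var(term):
            # Bind the variable, or reject if it is already bound differently.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def answer(query, facts=FACTS):
    """Match every triple pattern in `query`; return (binding, score) pairs, best first."""
    results = [({}, 1.0)]
    for pattern in query:
        results = [
            (extended, score * fact.confidence)
            for binding, score in results
            for fact in facts
            if (extended := unify(pattern, fact, binding)) is not None
        ]
    # Stand-in ranking: product of the confidences of the matched facts.
    return sorted(results, key=lambda r: r[1], reverse=True)


# "Which physicists were born in a city located in Germany?"
QUERY = [
    ("?x", "isA", "physicist"),
    ("?x", "bornIn", "?city"),
    ("?city", "locatedIn", "Germany"),
]
RANKED = answer(QUERY)

for binding, score in RANKED:
    print(binding["?x"], binding["?city"], round(score, 4))
```

In this sketch the shared variables `?x` and `?city` join the three patterns into one connected subgraph, so each answer is a set of edges rather than a Web page. With the toy confidences above, the Max_Planck answer ranks ahead of the Albert_Einstein answer because the latter relies on a low-confidence `locatedIn` fact.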